INTRODUCTION


Statistical models are, as the name implies, only models of reality.
As such, they only approximate the true process and, in general, more
than one model can be used to describe a given set of data. Under
these conditions, the experimenter must determine which models
adequately describe the observed data. Depending on how the model is
to be used, one of the candidate models is chosen. In this thesis, the
term model refers to the structural and distributional description of
the data. The nominal distributional model is the normal distribution;
the structural models are parametric models such as linear regression.
This thesis examines model-critical procedures which scrutinize the
data and the assumed model by varying the way the data are processed
during model fitting. Some aspects of model-critical analysis have
been presented by Presser (1980), Paulson, Presser and Lawrence (1981),
Paulson and Delaney (1981), Paulson, Presser and Nicklin (1982),
Paulson and Delehanty (1983), and Paulson and Swope (1987). All but
the last reference use the term self-critical in place of
model-critical. Since we are criticizing a model, the term
model-critical is adopted here.

Model-critical analysis is based on the generalized likelihood
(Paulson and Delehanty, 1983) for a random sample
$x_1, x_2, \ldots, x_n$,

$$L_c(\theta) = \frac{1}{c} \sum_{i=1}^{n} \left[ \frac{f^{c}(x_i;\theta)}{Q^{a}(c)} - 1 \right] \qquad (1.1)$$

where

$a = c/(1+c)$,

$f(x_i;\theta)$ is the assumed probability density of the data
evaluated at observation $x_i$,

$$Q(c) = \int f^{1+c}(x;\theta)\,dx \qquad (1.2)$$

is the information generating function for $f(x;\theta)$ (Golomb, 1966, and
Paulson and Delehanty, 1983), and c is the model-critical parameter.
The model-critical estimate for $\theta$ is the value $\theta(c)$ which maximizes
(1.1). The estimate $\theta(c)$ is a robust estimate for $\theta$ with the degree
of robustness controlled by c. It will be shown in Part 2 that
$L_0(\theta)$ is the usual log likelihood for $\theta$ and that $\theta(0)$ is the maximum
likelihood estimate of $\theta$. Differentiating $L_c(\theta)$ with respect to $\theta$
and setting the result equal to zero yields

$$\frac{\partial L_c(\theta)}{\partial\theta} = \sum_{i=1}^{n} \frac{f^{c}(x_i;\theta)}{Q^{a}(c)} \left[ \frac{\partial \log f(x_i;\theta)}{\partial\theta} - \frac{1}{1+c}\,\frac{\partial \log Q(c)}{\partial\theta} \right] = 0 \qquad (1.3)$$



which is a necessary condition that must be satisfied by $\theta(c)$. For
the models discussed in Part 2, equation (1.3) is used to obtain $\theta(c)$.

From (1.3) it can be seen that each term in the sum is weighted by the
$c$th power of the assumed probability density of the data evaluated
at $x_i$. The value of c determines the amount and type of weighting
used in (1.3). Positive values of c downweight outlying observations
and negative values of c downweight inlying observations. This
weighting produces a criticism of the data and the assumed model
$f(x;\theta)$ that can indicate if any model assumptions have been violated.
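
The weighting described above can be illustrated numerically. The following is a minimal sketch, assuming a univariate normal density with known mean and standard deviation; the function names (`normal_pdf`, `info_generating`, `generalized_loglik`, `critical_weights`) are illustrative and do not come from the thesis. It evaluates the generalized likelihood (1.1), shows that it approaches the ordinary log likelihood as c tends to zero, and computes the normalized weights proportional to the cth power of the density.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def info_generating(sigma, c):
    """Q(c) = integral of f^(1+c) for N(mu, sigma^2); closed form, valid for c > -1."""
    return (2 * np.pi * sigma**2) ** (-c / 2) / np.sqrt(1 + c)

def generalized_loglik(x, mu, sigma, c):
    """Generalized likelihood (1.1); c = 0 is treated as the ordinary log likelihood."""
    f = normal_pdf(x, mu, sigma)
    if c == 0:
        return np.sum(np.log(f))
    a = c / (1 + c)
    return np.sum(f**c / info_generating(sigma, c) ** a - 1) / c

def critical_weights(x, mu, sigma, c):
    """Weights proportional to f(x_i)^c, normalized to sum to one."""
    w = normal_pdf(x, mu, sigma) ** c
    return w / w.sum()

x = np.array([-0.5, 0.0, 0.3, 6.0])           # last value is a gross outlier
print(generalized_loglik(x, 0.0, 1.0, 1e-6))   # close to the log likelihood
print(generalized_loglik(x, 0.0, 1.0, 0.0))
print(critical_weights(x, 0.0, 1.0, 0.3))      # outlier gets the smallest weight
print(critical_weights(x, 0.0, 1.0, -0.3))     # outlier gets the largest weight
```

With c > 0 the observation far from the bulk of the data contributes almost nothing to the estimating equation, whereas with c < 0 the observations nearest the center are the ones that lose influence.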

Outliers  are  an  example  of  a  violation  of  the  model  assumptions  on 
the  data.  It  is  noted  that  an  observation  is  an  outlier  only  with 
respect  to  the  assumed  underlying  model;  if  a  different  model  is  used, 
the  observation  may  no  longer  be  an  outlier.  Unlike  unstructured  data, 
where  an  outlier  "sticks  out",  the  structure  of  a  model  can  hide  the 
outlier.  If  multiple  outliers  are  present,  they  can  compensate  each 
other  (Barnett  and  Lewis,  1978,  Chapter  7).  In  time  series  where  the 
observations  are  not  independent,  outliers  need  only  be  large  with 
respect  to  the  error  process  to  seriously  affect  the  parameter 
estimates (Kleiner, Martin, and Thomson, 1979). Outliers of this
magnitude may not show up in plots of the data. Fox (1972)
considers  two  outlier  models  for  time  series.  The  additive  outlier  is 
a  gross  error  at  a  single  observation.  The  innovations  outlier  is  a 
large  value  in  the  error  process  due  to  a  heavy  tailed  error 
distribution. It is noted that both types of outliers can occur with
independent as well as dependent observations. With dependent
observations, the innovations outlier will affect subsequent
observations due to the correlation between observations. A host of
robust or resistant procedures are available to reduce the effect of
outliers on the parameter estimates (see Andrews et al., 1972, and
Martin and Thomson, 1982). The previous discussion has focused on
outlying  contamination;  however,  inlying  or  short-tailed  contamination 
can  also  be  a  concern  (Hogg,  1974). 

From  the  above  discussion,  it  can  be  seen  that  robustness  and 
goodness  of  fit  are  related.  Model-critical  analysis  uses  this 
relationship  to  examine  models  for  a  given  set  of  data.  The  analysis 
compares  maximum  likelihood  and  robust  parameter  estimates.  Clearly, 
the  robust  and  maximum  likelihood  estimates  must  estimate  the  same 
quantity if they are to be comparable. From the derivation of $L_c(\theta)$,
the estimate $\theta(c)$ is a consistent estimate of $\theta$ (Delehanty, 1983).
Thus, $\theta(c)$ and $\theta(0)$ are two consistent estimates of $\theta$. If the data and
the assumed model are internally consistent, then $\theta(0)$ and $\theta(c)$ should
be approximately equal over a range of c values. However, if the data
and the assumed model are not consistent, then $\theta(c)$ will change
considerably as c increases. Large changes in the parameter estimates
$\theta(c)$ indicate that the model requires closer examination. Because
$\theta(c)$ is an M-estimator, robust weights are obtained as part of the
estimation process. For $c \neq 0$, these critical weights can be used to
flag questionable observations. Examination of the weights aids the
analyst in evaluating the model. For example, small weights indicate
outlying contamination when c > 0. Although effective for analyzing
models, the above procedure is quite subjective. In order to make
critical analysis more precise, Delehanty (1983) and Hwang (1984) have
presented goodness of fit statistics to test for multivariate
normality that compare the maximum likelihood and model-critical
estimates of the covariance matrix. These test statistics, like other
tests for Gaussianity, were developed for data without structure other
than a mean vector and covariance matrix. Since most data analysis
involves models with additional structure, it would be desirable to
have a test of fit which can be applied to structured as well as
unstructured models. In this way, the test could be applied to the
residuals of a structured model. Gentleman and Wilk (1975) have
applied the Shapiro-Wilk test (Shapiro and Wilk, 1965) to two-way
layout models; however, before using the test, percentage points of
the statistic had to be tabulated for the particular model. This is
undesirable, especially if a number of different models are to be
examined. Ideally one would like a test that can be developed for
unstructured data and also be applicable to structured data.


Part 2 discusses model-critical estimation and presents procedures
to obtain model-critical estimates for linear regression,
autoregression and two-way layout models; some complementary material
can be found in Delehanty (1983). Also, a model-critical selection
procedure is presented; it is a generalization of the selection
criteria of Akaike (1974), and Hannan and Quinn (1979) which are
special cases of the model-critical procedure. For data contaminated
[...] parameter c. The ability to apply the test to structured as well as
unstructured  models  makes  the  test  unique  among  tests  for  normality 
which  can  only  be  applied  to  unstructured  data.  Part  6  applies  the 
estimation,  model  selection  and  test  procedures  to  experimental  data. 


The following are some notation conventions. Lower case letters
denote vectors and capital letters denote matrices. Since scalars are
a special type of vector or matrix, both lower and upper case letters
will be used to denote scalars; in general, lower case letters will
denote scalars. In general, the vector of parameters $\theta$ will not be
included in the arguments of a probability density; for example, $f(x)$
will denote $f(x;\theta)$. The estimates of a parameter $\theta$, for example, will
be denoted by $\hat\theta$, $\theta(0)$, or $\theta(c)$, where $\theta(0)$ and $\theta(c)$ denote the maximum
likelihood and model-critical estimates, respectively.
PART 2

MODEL-CRITICAL  PROCEDURES 


2.1 Introduction

Model-critical procedures provide the analyst with a means to
analyze a proposed model for a set of observed data. Since there is
usually more than one model which can be used to describe a set of
data, the procedures allow for the selection of a model using a
model-critical analogue of the Akaike selection criterion. If the
data are contaminated with outliers, for example, the model-critical
selection criterion will select the model which describes the bulk of
the data. Section 2.2 defines the generalized likelihood function
which is used to obtain model-critical parameter estimates. Section
2.3 presents model-critical estimation procedures for a number of
widely used models; some complementary material can be found in
Delehanty (1983). Section 2.4 derives the model-critical selection
criterion, which is analyzed in Section 2.5 using simulated
autoregressive processes with and without outliers.



2.2 Generalized Likelihood

Let the p x 1 vectors $x_1, x_2, \ldots, x_n$ constitute a random
sample from the p-variate Gaussian distribution denoted $N_p(m, D)$ with
mean vector m and covariance matrix D, and with probability density

$$f(x) = |2\pi D|^{-1/2} \exp\left[ -(x - m)^T D^{-1} (x - m)/2 \right]. \qquad (2.1)$$

The information generating function of $f(x)$ is defined by (Golomb,
1966; Paulson and Delehanty, 1983)

$$Q(m, D, c) = \int f^{c}(x)\, f(x)\, dx \qquad (2.2)$$

for c contained in some nondegenerate neighborhood of c = 0. The
expression of (2.2) can be explicitly and directly evaluated as

$$Q(m, D, c) = \left[\, |2\pi D|^{c} (1+c)^{p} \,\right]^{-1/2}, \quad c > -1. \qquad (2.3)$$

All mutual self-information quantities can be obtained directly from
(2.3), e.g., the entropy of $f(x)$ is given by

$$-Q_c(m, D, 0) = (p + \log |2\pi D|)/2 \qquad (2.4)$$

where $Q_c(m, D, 0)$ represents the first partial derivative of Q with
respect to c and c set to 0.
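
The closed form (2.3) and the entropy relation (2.4) can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the integral (2.2) is approximated by a Riemann sum in the univariate case, and the derivative in (2.4) is approximated by a central difference.

```python
import numpy as np

def info_generating(D, c):
    """Closed form (2.3): Q(m, D, c) = [|2 pi D|^c (1+c)^p]^(-1/2), for c > -1."""
    D = np.atleast_2d(np.asarray(D, dtype=float))
    p = D.shape[0]
    _, logdet = np.linalg.slogdet(2 * np.pi * D)
    return np.exp(-0.5 * (c * logdet + p * np.log1p(c)))

def numeric_q_1d(sigma2, c, half_width=40.0, n_grid=200001):
    """Riemann-sum approximation of (2.2) for a univariate N(0, sigma2) density."""
    x = np.linspace(-half_width, half_width, n_grid)
    dx = x[1] - x[0]
    f = np.exp(-0.5 * x**2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
    return np.sum(f ** (1 + c)) * dx

def entropy_from_q(D, h=1e-5):
    """Entropy via (2.4): minus the derivative of Q with respect to c at c = 0."""
    return -(info_generating(D, h) - info_generating(D, -h)) / (2 * h)

D = np.array([[2.0, 0.5], [0.5, 1.0]])
print(numeric_q_1d(1.7, 0.4), info_generating(1.7, 0.4))
print(entropy_from_q(D), 0.5 * (2 + np.linalg.slogdet(2 * np.pi * D)[1]))
```

Each printed pair should agree to several decimal places, which is a quick sanity check on the signs and exponents in (2.3) and (2.4).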


The generalized likelihood for m and D given the density (2.1) and
the random sample $x_1, x_2, \ldots, x_n$ is (Paulson and Delehanty,
1983)

$$L_c(m, D) = \frac{1}{c} \sum_{i=1}^{n} \left[ \frac{f^{c}(x_i)}{Q^{*}(m, D, c)} - 1 \right] \qquad (2.5)$$

where c > -1 and $Q^{*}(m, D, c) = Q(m, D, c)^{c/(1+c)}$. It is easily shown by
expansion or by L'Hospital's rule that

$$\lim_{c \to 0} L_c(m, D) = \sum_{i=1}^{n} \log f(x_i) = L_0(m, D)$$

is the usual log likelihood. The model-critical estimates for m and D
are the values m(c) and D(c) which maximize (2.5). The estimates m(c)
and D(c) are the solutions to the system of equations

$$\frac{\partial L_c}{\partial\theta} = \sum_{i=1}^{n} \frac{f_i^{c}}{Q^{*}} \left[ \frac{\partial \log f_i}{\partial\theta} - \frac{1}{1+c}\,\frac{\partial \log Q}{\partial\theta} \right] = 0 \qquad (2.6)$$

for $\theta$ = m and D, and where the arguments of $f_i = f(x_i)$ and Q have
been suppressed for notational convenience. Each term in (2.6) is
weighted by the $c$th power of $f(x_i)$, and this affects the
estimation of m and D by downweighting terms in (2.6) corresponding to
small values of $f_i^{c}$. Using (2.6) the following set of implicit
estimation equations for m and D are obtained.

$$m = \sum_{i=1}^{n} w_i x_i \qquad (2.7a)$$

$$D = (1+c) \sum_{i=1}^{n} w_i (x_i - m)(x_i - m)^T \qquad (2.7b)$$

where

$$w_k = f^{c}(x_k) \Big/ \sum_{i=1}^{n} f^{c}(x_i). \qquad (2.7c)$$



The estimates for m and D are the values m(c) and D(c) which satisfy
system (2.7). Clearly, when c = 0, m(0) and D(0) are the usual maximum
likelihood estimates. The specification of the user-provided constant
c is based on sample size n, dimension p, and the character of the
sample. Part 5 discusses appropriate values of c; an additional
discussion can be found in Paulson and Delehanty (1983). Equations
(2.7) define a family of estimates for m and D indexed on c, m(c) and
D(c), with m(0) and D(0) the maximum likelihood estimates for m and D.
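
One way to solve the implicit system (2.7) is by successive substitution. The following is a minimal sketch, not the thesis's own program: the function name, the starting values (the maximum likelihood estimates), the stopping rule, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def model_critical_gaussian(X, c, tol=1e-10, max_iter=500):
    """Iterate (2.7a)-(2.7c) for the mean m and covariance D of an n x p sample X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    m = X.mean(axis=0)                                    # start at the ML estimates
    D = np.cov(X, rowvar=False, bias=True).reshape(p, p)
    w = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        R = X - m
        quad = np.einsum("ij,jk,ik->i", R, np.linalg.inv(D), R)
        logw = -0.5 * c * quad                            # log f^c up to a constant
        w = np.exp(logw - logw.max())
        w /= w.sum()                                      # normalized weights (2.7c)
        m_new = w @ X                                     # (2.7a)
        R = X - m_new
        D_new = (1 + c) * (R * w[:, None]).T @ R          # (2.7b)
        change = max(np.abs(m_new - m).max(), np.abs(D_new - D).max())
        m, D = m_new, D_new
        if change < tol:
            break
    return m, D, w

# Illustrative data: 100 standard bivariate normal points plus 5 distant outliers.
rng = np.random.default_rng(0)
X = np.vstack([rng.standard_normal((100, 2)),
               8.0 + 0.1 * rng.standard_normal((5, 2))])
m0, D0, _ = model_critical_gaussian(X, 0.0)   # maximum likelihood
m3, D3, w3 = model_critical_gaussian(X, 0.3)  # model-critical
print(m0, m3)
print(np.trace(D0), np.trace(D3))
print(w3[-5:])                                # weights of the appended outliers
```

For c = 0 the weights are uniform and the iteration stops at the sample mean and the maximum likelihood covariance matrix. For c = 0.3 the appended points receive weights near zero, so the estimates describe the main cluster.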


When $c \neq 0$, the weighting $w_i$ is determined from the assumed
multivariate density and the data $x_1, x_2, \ldots, x_n$. Data that are
not consistent with the multivariate Gaussian assumption will receive
small weights $w_i$. This affects the estimates m(c) and D(c) by
downweighting data which are not consistent with the model. It is
noted that all the data are used in the estimation process, the
influence of each observation on the estimates m(c) and D(c) being
determined by its weight $w_i$. If the data and the multivariate
Gaussian model are internally consistent, then m(c) and D(c) will be
approximately equal to m(0) and D(0). For $c \neq 0$, the procedure
estimates the parameters m and D for the most Gaussian-like cluster in
the data. For c > 0, outlying observations will be downweighted and,
for c < 0, inlying observations will be downweighted. The choice of c
determines how the data are processed.


As  an  illustration,  generalized  likelihood  parameter  estimates  are 
presented  for  a  bivariate  sample  taken  from  Anderson  (1984,  p.  97). 
Table  2.1  lists  the  25  observations  from  Anderson  plus  5  outliers 
which  have  been  appended.  A  scatter  plot  of  the  data  is  shown  in 
Figure  2.1  where  an  x  signifies  one  of  the  original  25  observations 
and  a  y  signifies  an  outlier.  The  parameter  estimates  m(c)  and  D(c) 
for  c  =  0  (maximum  likelihood),  0.1,  0.2,  0.3,  and  0.4  are  shown  in 
Table 2.2. As c increases from 0 to 0.4, the estimated mean vector
changes little, whereas the covariance structure changes considerably.
The estimated correlation coefficient increases from 0.45 to 0.88 and
the estimated standard deviations $\hat\sigma_1 = D_{11}^{1/2}$ and
$\hat\sigma_2 = D_{22}^{1/2}$ decrease from 12.4 and 9.3 to 9.1 and 5.5,
respectively. For c = 0.4, the
parameter estimates are closer to the estimates from the uncontaminated
data than are the maximum likelihood estimates.


For c = 0.1, 0.2, 0.3, and 0.4, Table 2.1 shows the unnormalized
weights $w_i = \exp(-c\,(x_i - m(c))^T D(c)^{-1} (x_i - m(c))/2)$ of each
observation; all the weights are one for c = 0. For observations 25 to
30, the corresponding weights decrease rapidly as c increases and only
observations 29 and 30 have weights different from zero when c = 0.3.

The  five  appended  outliers  are  perceived  as  not  being  consistent  with 
the  original  25  observations  and  the  assumed  single  Gaussian 
population.  The  estimation  procedure  clusters  the  data  in  the  sense 
that  the  original  25  observations  are  retained  as  a  single  population 
and  the  outliers  are  more  or  less  ignored. 

2.3  Models  With  Additional  Structure 

In this section, models with structure in addition to a mean
vector and covariance matrix are examined. That is, the observations
are of the form

$$y_i = h(x_i; \theta) + \varepsilon_i \qquad (3.1)$$

where

$y_i$ is a p x 1 vector of observations,

$x_i$ is a q x 1 vector of concomitant variables,

$\theta$ is a q x 1 vector of parameters to be estimated from the data,
and

$\varepsilon_i$ is a p x 1 vector of errors.

The errors $\varepsilon_i$ are assumed to be independent and identically
distributed Gaussian random variables with zero mean and covariance
matrix D. The model $h(x_i; \theta) = m$ was examined in Section 2.2.
The probability density of the errors is

$$f(\varepsilon_i) = |2\pi D|^{-1/2} \exp(-\varepsilon_i^T D^{-1} \varepsilon_i/2)
 = |2\pi D|^{-1/2} \exp\left( -(y_i - h(x_i;\theta))^T D^{-1} (y_i - h(x_i;\theta))/2 \right). \qquad (3.2)$$

The generalized likelihood without the constant term, denoted L(c), is

$$L(c) = \frac{1}{c} \sum_{i=1}^{n} \left[ \frac{(1+c)^{p}}{|2\pi D|} \right]^{a} \exp\left( -c\,(y_i - h(x_i;\theta))^T D^{-1} (y_i - h(x_i;\theta))/2 \right) \qquad (3.3)$$

where

$a = 0.5c/(1+c)$.

For the proposed model $h(x;\theta)$ in L(c), the model-critical estimates of
$\theta$ and D are obtained by maximizing (3.3) over $\theta$ and D. For many
models, setting the derivatives of L(c) with respect to $\theta$ and D equal
to zero yields a set of implicit equations which can be solved via a
fixed  point  algorithm.  Autoregressive-moving  average  (ARMA)  models 
cannot  be  solved  via  a  fixed  point  algorithm.  Since  ARMA  models 
require  a  different  estimation  procedure,  they  are  discussed  separately 
in  Parts  3  and  4. 

2.3.1  Linear  Regression 

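For multivariate linear regression, $h(x_i; A) = A x_i$ and (3.1)
becomes

$$y_i = A x_i + \varepsilon_i. \qquad (3.4)$$

For the regression model, L(c) is obtained by substituting $A x_i$ for
$h(x_i;\theta)$ in (3.3). Setting equal to zero the derivatives of L(c) with
respect to A and D, the model-critical estimation equations for A and D
are

$$A(c) = \left[ \sum_{i=1}^{n} w_i y_i x_i^T \right] \left[ \sum_{i=1}^{n} w_i x_i x_i^T \right]^{-1} \qquad (3.5)$$

$$D(c) = \frac{1+c}{w_{\cdot}} \sum_{i=1}^{n} w_i (y_i - A(c) x_i)(y_i - A(c) x_i)^T \qquad (3.6)$$

where

$$w_i = \exp\left( -c\,(y_i - A(c) x_i)^T D(c)^{-1} (y_i - A(c) x_i)/2 \right) \qquad (3.7)$$

and

$$w_{\cdot} = \sum_{i=1}^{n} w_i.$$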

As an example, consider the abrasion resistance of rubber data
(Suich and Derringer, 1977) in Table 2.3. The model considered is

$$y_i = a_0 + a_1 x_1 + a_2 x_1^2 + a_3 x_2 + a_4 x_2^2 + a_5 x_1 x_2 + \varepsilon_i. \qquad (3.8)$$

   c      a0      a1      a2      a3      a4      a5      s

 -0.1    97.6    5.80   -0.17    6.09   -3.80    2.86    3.36
  0.0    97.7    5.87   -0.11    6.04   -3.88    2.82    3.43
  0.1    97.8    5.96   -0.01    5.97   -4.00    2.78    3.46
  0.2    98.0    6.09    0.15    5.90   -4.19    2.72    3.43
  0.3    98.4    6.47    0.64    5.80   -4.70    2.66    3.14
  0.4    98.9    7.07    1.26    5.83   -5.26    2.72    2.55


The independent variables are $x_1$, silica filler level, and $x_2$,
coupling agent level. Table 2.4 presents the maximum likelihood and
model-critical estimates for c = -0.1, 0.0, 0.1, 0.2, 0.3, and 0.4.
Over the range of c values, the coefficients $a_1$, $a_2$, and $a_4$ change
considerably; $a_1$, $a_2$, and $a_4$ are the coefficients of $x_1$, $x_1^2$,
and $x_2^2$, respectively. The critical weights shown in Table 2.3
indicate that the data for observations 1 and 10 should be examined
further, which we will do in Part 6.

2.3.2  Autoregression 

For the multivariate autoregressive (AR) model of order m, (3.1)
can be expressed as

$$y_i = \sum_{k=1}^{m} A_k y_{i-k} + \varepsilon_i \qquad (3.9)$$

for i = m+1, m+2, ..., n. The model-critical estimates for D and $A_k$,
k = 1, 2, ..., m, are obtained by differentiating L(c) with respect to
D and $A_k$, k = 1, 2, ..., m; setting the derivatives equal to zero; and
solving for D and $A_k$, k = 1, 2, ..., m. The model-critical estimation
equations are

$$A(c) = b\,C^{-1} \qquad (3.10)$$

$$D(c) = \frac{1+c}{w_{\cdot}} \sum_{i=m+1}^{n} w_i \Big( y_i - \sum_{k=1}^{m} A_k(c) y_{i-k} \Big) \Big( y_i - \sum_{k=1}^{m} A_k(c) y_{i-k} \Big)^T \qquad (3.11)$$

where

$$A(c) = [A_1(c), A_2(c), \ldots, A_m(c)]. \qquad (3.12)$$

The rs entry of C is

$$\sum_{i=m+1}^{n} w_i\, y_{i-r}\, y_{i-s}^T \qquad (3.13)$$

for r, s = 1, 2, ..., m, the r entry of b is

$$\sum_{i=m+1}^{n} w_i\, y_i\, y_{i-r}^T \qquad (3.14)$$

where

$$w_i = \exp\Big( -c\,\Big( y_i - \sum_{k=1}^{m} A_k(c) y_{i-k} \Big)^T D(c)^{-1} \Big( y_i - \sum_{k=1}^{m} A_k(c) y_{i-k} \Big) \Big/ 2 \Big) \qquad (3.15)$$

and

$$w_{\cdot} = \sum_{i=m+1}^{n} w_i.$$



The estimates A(0), D(0) and A(c), D(c), $c \neq 0$, are conditional maximum
likelihood and conditional model-critical estimates since they are
conditioned on the first m observations; however, the estimates will
still be referred to as maximum likelihood or model-critical
estimates. A procedure to estimate the full maximum likelihood and
model-critical estimates for ARMA models will be presented in Parts 3
and 4.
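
For a univariate series the conditional estimates can be obtained by treating the lagged values as regressors and iterating on the weights. The sketch below rests on that assumption; the function name, the uniform starting weights, the stopping rule, and the simulated AR(1) series with one additive outlier are illustrative and are not the thesis's own example.

```python
import numpy as np

def model_critical_ar(x, order, c, tol=1e-10, max_iter=500):
    """Conditional model-critical estimates for a univariate AR(order) series."""
    x = np.asarray(x, dtype=float)
    y = x[order:]                                     # responses x_t, t > order
    # Column k-1 holds the lag-k values x_{t-k} aligned with y.
    Z = np.column_stack([x[order - k:len(x) - k] for k in range(1, order + 1)])
    w = np.ones(len(y))
    a = np.zeros(order)
    s2 = 1.0
    for _ in range(max_iter):
        ZtW = Z.T * w
        a_new = np.linalg.solve(ZtW @ Z, ZtW @ y)     # weighted normal equations
        e = y - Z @ a_new                             # one-step residuals
        s2_new = (1 + c) * np.sum(w * e**2) / w.sum() # innovations variance
        change = max(np.abs(a_new - a).max(), abs(s2_new - s2))
        a, s2 = a_new, s2_new
        w = np.exp(-0.5 * c * e**2 / s2)              # critical weights
        if change < tol:
            break
    return a, s2, w

# Illustrative series: AR(1) with coefficient 0.8 and one additive outlier.
rng = np.random.default_rng(1)
n = 300
e = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + e[t]
y = x.copy()
y[150] += 10.0
a0, s0, _ = model_critical_ar(y, 1, 0.0)
a3, s3, w3 = model_critical_ar(y, 1, 0.3)
print(a0, s0)
print(a3, s3)
print(w3[148:152])   # weights around the contaminated time point
```

Because an additive outlier enters both as a response and as a lagged regressor, the weight at the following time point is typically small as well, which is the dependence effect noted in the text.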


As an illustration, model-critical estimates are presented for a
simulated Gaussian AR(4) process (ARMA(4,0)) with representation

$$x_t = 2.0625\,x_{t-1} - 2.4325\,x_{t-2} + 1.5845\,x_{t-3} - 0.652\,x_{t-4} + \varepsilon_t$$

where the $\varepsilon_t$ are independent identically distributed with zero mean
and unit variance. Figure 2.2 is a plot of the realization used to
obtain the parameter estimates. Table 2.5 contains the parameter
estimates for c = 0, 0.1, 0.2, 0.3, and 0.4. It can be seen that the
model-critical and maximum likelihood estimates are approximately
equal. Next, additive outliers were added to four observations
selected at random in the realization shown in Figure 2.2. The
outliers are independent, identically distributed Gaussian random
variables with zero mean and variance 2; also, the outliers are
independent of $x_t$. The additive outlier model (Martin, 1981) is
given by $y_t = x_t + v_t$ where $x_t$ is the AR(4) process and $v_t$
is the outlier. For this example, $v_t \neq 0$ for only four
observations. Figure 2.3 is a plot of the data in Figure 2.2 with
outliers added to four observations; without a priori knowledge, one
would not suspect that outliers are present. Martin (1981) notes that
for time series, the outliers need only be large relative to the
innovations process to seriously affect the parameter estimates.
Outliers of this magnitude may not stand out as they do
when the observations are independent. This is clearly seen in Table
2.6, which presents the maximum likelihood and model-critical parameter
estimates for the data in Figure 2.3. As c increases, the
model-critical estimates approach the true values; the improvement in
the estimates follows from the downweighting of the outliers in
model-critical estimation. For the data with outliers, Figure 2.4
contains plots of the model-critical and maximum likelihood spectrum.
The critical spectrum is closer to the true spectrum (the solid line)
than the maximum likelihood spectrum estimate. For the data with
outliers, the maximum likelihood spectrum contains more high frequency
components than the critical spectrum. This is the case because
maximum likelihood estimation fits all the data, whereas critical
estimation fits the bulk of the data without outliers.


TABLE 2.5

Maximum Likelihood (c = 0) and Model-Critical (c ≠ 0) Parameter
Estimates for a Simulated Univariate AR(4) Process with
True Parameters α1 = 2.0625, α2 = -2.4325,
α3 = 1.5845, α4 = -0.652, and
σ = 1; Sample Size = 100

   c        a1        a2        a3        a4        s2

  0.0    2.1346   -2.5685    1.7113   -0.7092    0.9100
  0.1    2.1190   -2.5438    1.6906   -0.7094    0.9137
  0.2    2.0984   -2.5040    1.6527   -0.6981    0.8915
  0.3    2.0841   -2.4820    1.6353   -0.6999    0.8663
  0.4    2.0631   -2.4400    1.5962   -0.6914    0.8241


TABLE 2.6

Maximum Likelihood (c = 0) and Model-Critical (c ≠ 0) Parameter
Estimates for a Simulated Univariate AR(4) Process with
Four Additive Outliers; True Parameters α1 = 2.0625,
α2 = -2.4325, α3 = 1.5845, α4 = -0.652, and
σ = 1; Sample Size = 100

   c        a1        a2        a3        a4        s2

  0.0    1.7661   -1.7643    0.9633   -0.4409    2.0344
  0.1    1.8477   -1.9116    1.0799   -0.4651    1.7752
  0.2    1.9556   -2.1520    1.2968   -0.5410    1.4180
  0.3    2.0342   -2.3064    1.4354   -0.5912    1.1620
  0.4    2.0391   -2.3498    1.4765   -0.6102    1.0740

For c = 0.4, Figure 2.5 is a plot of the model-critical weights

$$w_t = \exp\Big( -c\,\Big( x_t - \sum_{k=1}^{4} a_k(c)\, x_{t-k} \Big)^2 \Big/ 2 s^2(c) \Big) \qquad (3.16)$$

where $s^2(c)$ and $a_k(c)$ are calculated using (3.10) and (3.11).
Since the critical weights are a measure of fit between the data and
the model, analysis of the weights is an integral part of the modeling
process. In Figure 2.5, the small weights at observations 5, 6, 11,
and 25 indicate that the model which describes the bulk of the data
does not give a good fit to these observations. In fact, some of the
subsequent weights are small due to the dependence between
observations. Small weights alert the experimenter to the fact that
further analysis of the data and model may be necessary.


2.3.3 Factorial Designs

Model-critical estimates for analysis of variance models are
presented here. Since the notation of the general analysis of
variance model becomes tedious, the multivariate two-way layout with
interaction will be used to illustrate model-critical procedures for
factorial designs. The model is

$$y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk} \qquad (3.17)$$

for i = 1, 2, ..., I; j = 1, 2, ..., J; and
k = 1, 2, ..., $n_{ij}$.
As with the previous models, a system of equations can be obtained by
substituting (3.17) into (3.3) and differentiating with respect to $\mu$,
$\alpha_i$, $\beta_j$, and $\gamma_{ij}$ for i = 1, 2, ..., I; j = 1, 2, ..., J. The system
of equations obtained by setting the above derivatives equal to zero
is not of full rank. The following constraint equations will yield a
full rank system of equations.

$$\sum_i \alpha_i w_{i\cdot\cdot} = \sum_j \beta_j w_{\cdot j \cdot} = 0,$$

$$\sum_i \gamma_{ij} w_{ij\cdot} = 0 \quad \text{for all } j,$$

$$\sum_j \gamma_{ij} w_{ij\cdot} = 0 \quad \text{for all } i,$$

where the dot indicates summation over a subscript. With the above
constraints, the following recursion relations for the estimation of
[...]

$$\mu(c) = \sum_i \sum_j \sum_k y_{ijk} w_{ijk} \Big/ w_{\cdot\cdot\cdot} \qquad (3.20)$$

$$R(c) = \frac{1+c}{w_{\cdot\cdot\cdot}} \sum_i \sum_j \sum_k w_{ijk}\, e_{ijk}\, e_{ijk}^T$$

where

$$e_{ijk} = y_{ijk} - \mu(c) - \alpha_i(c) - \beta_j(c) - \gamma_{ij}(c)$$

and

$$w_{ijk} = \exp\left( -c\, e_{ijk}^T R(c)^{-1} e_{ijk}/2 \right).$$


Unless  indicated  specifically,  all  summations  are  over  the  range  of 
the  indicated  subscripts.  A  fixed  point  algorithm  is  used  to  obtain 
the  parameter  estimates.  Where  parameter  estimates  appear  explicitly 
on  the  right  side  of  a  relation,  the  current  estimate  is  used.  The 
weights  are  updated  after  all  the  effects  and  error  covariance  matrix 
are  estimated. 
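
The fixed point scheme can be sketched for the univariate two-way layout with interaction. The sketch below does not reproduce the recursion relations for the individual effects. It relies on a simplifying assumption: with a full interaction term the fitted value in each cell is a weighted cell mean, so only the cell means, the overall mean (3.20), and the error scale are iterated. The function name and stopping rule are illustrative; the data are the survival times transcribed from Table 2.7.

```python
import numpy as np

def model_critical_two_way(y, c, tol=1e-10, max_iter=1000):
    """Weighted cell-mean iteration for a univariate two-way layout, y[i, j, k]."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y)
    cell = y.mean(axis=2)
    s2 = y.var()
    for _ in range(max_iter):
        cell_new = (w * y).sum(axis=2) / w.sum(axis=2)    # weighted cell means
        e = y - cell_new[:, :, None]                      # residuals e_ijk
        s2_new = (1 + c) * np.sum(w * e**2) / w.sum()     # univariate R(c)
        change = max(np.abs(cell_new - cell).max(), abs(s2_new - s2))
        cell, s2 = cell_new, s2_new
        w = np.exp(-0.5 * c * e**2 / s2)                  # critical weights
        if change < tol:
            break
    mu = np.sum(w * y) / w.sum()                          # overall mean, as in (3.20)
    return mu, cell, np.sqrt(s2), w

# Table 2.7: rows are replicates within poison, columns are treatments 1 to 4.
rows = np.array([
    [0.31, 0.82, 0.43, 0.45], [0.45, 1.10, 0.45, 0.71],
    [0.46, 0.88, 0.63, 0.66], [0.43, 0.72, 0.76, 0.62],
    [0.36, 0.92, 0.44, 0.56], [0.29, 0.61, 0.35, 1.02],
    [0.40, 0.49, 0.31, 0.71], [0.23, 1.24, 0.40, 0.38],
    [0.22, 0.30, 0.23, 0.30], [0.21, 0.37, 0.25, 0.36],
    [0.18, 0.38, 0.24, 0.31], [0.23, 0.29, 0.22, 0.33],
])
y = rows.reshape(3, 4, 4).transpose(0, 2, 1)   # index order: poison, treatment, replicate
mu0, _, s0, _ = model_critical_two_way(y, 0.0)
mu3, _, s3, w3 = model_critical_two_way(y, 0.3)
print(round(mu0, 3), round(s0, 3))             # maximum likelihood values
print(mu3, s3)
print(w3[1, 1, 3])                             # poison 2, treatment 2, replicate 4
```

For c = 0 the overall mean and error scale printed here can be compared with the first column of Table 2.8. The c = 0.3 values from this simplified iteration are not guaranteed to match the tabulated ones.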

As  an  example,  parameter  estimates  are  obtained  for  a  univariate 
two-way  layout  with  interaction.  Table  2.7  presents  survival  time  data 
which  are  taken  from  Box  and  Cox  (1964).  The  parameter  estimates  are 
obtained  using  the  system  of  equations  (3.19)  and  (3.20).  Table  2.8 
presents  the  maximum  likelihood  and  model-critical  parameter  estimates. 
Almost  all  the  parameter  estimates  change  as  c  increases  from  0  to  0.3. 
For c = 0.3, the model-critical weights are presented in Table 2.9.

The  small  weights  in  cells  (1,2,2)  (i.e.,  poison  1,  treatment  2,  and 
TABLE 2.7

Survival Time Data for Three Poisons and Four Treatments

                      Treatment
 Poison       1       2       3       4

   1        0.31    0.82    0.43    0.45
            0.45    1.10    0.45    0.71
            0.46    0.88    0.63    0.66
            0.43    0.72    0.76    0.62

   2        0.36    0.92    0.44    0.56
            0.29    0.61    0.35    1.02
            0.40    0.49    0.31    0.71
            0.23    1.24    0.40    0.38

   3        0.22    0.30    0.23    0.30
            0.21    0.37    0.25    0.36
            0.18    0.38    0.24    0.31
            0.23    0.29    0.22    0.33

TABLE 2.8

Maximum Likelihood (c = 0) and Model-Critical (c ≠ 0)
Parameter Estimates for the Survival Data

 Parameter      c = 0         c = 0.1       c = 0.3

   μ            0.479         0.461         0.412
   α1           0.138         0.148         0.168
   α2           0.650(-1)     0.535(-1)     0.300(-1)
   α3          -0.203        -0.190        -0.146
   β1          -0.165        -0.151        -0.110
   β2           0.197         0.186         0.133
   β3          -0.869(-1)    -0.744(-1)    -0.565
   β4           0.548(-1)     0.590(-1)     0.770(-1)
   γ11         -0.398(-1)    -0.460(-1)    -0.493(-1)
   γ21         -0.592(-1)    -0.443(-1)    -0.108(-1)
   γ31          0.989(-1)     0.892(-1)     0.539(-1)
   γ12          0.652(-1)     0.778(-1)     0.107
   γ22          0.733(-1)     0.646        -0.230(-1)
   γ32         -0.138        -0.122        -0.642(-1)
   γ13          0.369(-1)     0.288(-1)    -0.259(-1)
   γ23         -0.825(-1)    -0.656(-1)    -0.108(-1)
   γ33          0.456(-1)     0.380(-1)     0.257(-1)
   γ14         -0.623(-1)    -0.561(-1)    -0.147(-1)
   γ24          0.683(-1)     0.715(-1)     0.660(-1)
   γ34         -0.604(-2)    -0.541(-2)    -0.185(-1)
   s            0.129         0.117         0.647(-1)

replication 2), (1,3,4), (1,4,1), (2,2,1), (2,2,4), (2,4,2), and
(2,4,4) indicate that the joint structure of the two-way layout with
interaction model and independent, identically distributed Gaussian
errors is not consistent with the data. As with the other examples,
these data will be examined further in Part 6.

2.4  Model-Critical  Selection 

Until  now  it  has  hpen  assumed  that  the  true  model  for  the  data  is 
known.  In  practice,  the  model  is  selected  from  a  number  of  competing 
models.  A  number  of  procedures  exist  for  selecting  a  model;  two  widely 
used  selection  criteria  are  the  Akaike  information  criterion  (AIC) 
(Akaike,  1974)  and  Mallows1  statistic  (Mallows,  '•■’3).  Although 
not  restricted  to  a  particular  class  of  models,  the  AIC  has  been  used 
primarily  in  time  series  modeling.  The  Cp  statistic  has  been  widely 
used  in  linear  regression  modeling.  Other  model  selection  procedures 
are  discussed  in  Draper  and  Smith  (1966),  and  Daniel  and  Wood  (1980). 

Using the generalized likelihood, we derive a model-critical
selection criterion which is the model-critical analogue to the AIC.

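It is assumed that all the necessary regularity conditions are
satisfied. The following derivation is for univariate data; however,
there is nothing in the derivation which restricts the criterion to
univariate data. Let $L_c(\theta)$ denote the generalized likelihood
evaluated at $\theta$, where $\theta$ is a p-dimensional parameter vector. Let
$\hat{\theta}$ denote the value of $\theta$ which maximizes $L_c(\theta)$. Let

$$H_0 = \left[ -\frac{\partial^2 L_c(\theta)}{\partial\theta_i\,\partial\theta_j} \right]_{\theta=\theta_0} \qquad (4.1)$$

and

$$V_0 = \left[ \frac{\partial L_c(\theta)}{\partial\theta_i}\,\frac{\partial L_c(\theta)}{\partial\theta_j} \right]_{\theta=\theta_0} \qquad (4.2)$$

where $\theta_0$ denotes the true parameter vector.

If the correct model is used in $L_c(\theta)$, then $\sqrt{n}(\hat{\theta} - \theta_0)$ is
asymptotically normally distributed with zero mean and covariance
matrix $H_0^{-1} V_0 H_0^{-1}$ (see Delehanty, 1983).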


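Approximating $L_c(\theta_0)$ as a truncated Taylor series about $\hat{\theta}$
yields

$$L_c(\theta_0) - L_c(\hat{\theta}) \approx \frac{1}{2}(\hat{\theta} - \theta_0)^T \left[\frac{\partial^2 L_c(\theta)}{\partial\theta\,\partial\theta^T}\right]_{\theta=\hat{\theta}} (\hat{\theta} - \theta_0) \qquad (4.3)$$

since $\partial L_c(\theta)/\partial\theta = 0$ at $\theta = \hat{\theta}$. Equation (4.3) can be rearranged to obtain

$$2(L_c(\hat{\theta}) - L_c(\theta_0)) = (\hat{\theta} - \theta_0)^T \hat{H} (\hat{\theta} - \theta_0) \qquad (4.4)$$
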

where $\hat{H}$ is (4.1) evaluated at $\hat{\theta}$. Delehanty (1983) has shown that
$2(L_c(\hat{\theta}) - L_c(\theta_0))$ has the same asymptotic distribution as
$z^T V_0^{1/2} H_0^{-1} V_0^{1/2} z$, where z is a multivariate normal vector with zero
mean and covariance matrix equal to the identity matrix. For Gaussian
data,

$$H_0 = A(1 + c)^{-3/2} C^{-1} \qquad (4.5a)$$

$$V_0 = A^2(1 + 2c)^{-3/2} C^{-1} \qquad (4.5b)$$

where C depends on the underlying model and

$$A = \left(\frac{1 + c}{2\pi\sigma_0^2}\right)^{c/2(1+c)} \qquad (4.6)$$


From the above discussion, it follows that

$$\frac{2(L_c(\hat{\theta}) - L_c(\theta_0))}{A((1 + c)/(1 + 2c))^{3/2}} \qquad (4.7a)$$

$$= \frac{(\hat{\theta} - \theta_0)^T \hat{H} (\hat{\theta} - \theta_0)}{A((1 + c)/(1 + 2c))^{3/2}} \qquad (4.7b)$$

has an asymptotic chi-square distribution with $p = p_0$ degrees of
freedom, where $p_0$ is the dimension of $\theta_0$. In fact, (4.7b) is
asymptotically the sum of $p_0$ independent chi-square random variables,
each with one degree of freedom. Since $(1/n)\hat{H}$ converges to $H_0$ in
probability, $v = \hat{H}^{1/2}(\hat{\theta} - \theta_0)$ is asymptotically distributed the same as
$A^{1/2}((1 + c)/(1 + 2c))^{3/4} z$, where the entries of z are independent
normal random variables with zero mean and unit variance. That is, up
to the scale factor $A^{1/2}((1 + c)/(1 + 2c))^{3/4}$, the entries of v are
asymptotically standard normal random variables. Since
$\hat{\theta}$ converges to $\theta_0$ in probability at the true model, we submit without
proof that the entries of $\hat{H}^{1/2}(\hat{\theta} - \theta_0)/\sqrt{n}$ obey the law of the
iterated logarithm (Heyde, 1973). That is, for large n,

$$v_i = A^{1/2}\left(\frac{1 + c}{1 + 2c}\right)^{3/4} \delta_i(n)\,(2\log\log n)^{1/2} \qquad (4.8)$$

where $-1 < \delta_i(n) < 1$. Squaring and summing the entries $v_i$ yields

$$v^T v = \sum_{i=1}^{p} v_i^2 = A\left(\frac{1 + c}{1 + 2c}\right)^{3/2} 2p\,\delta(n)\log\log n \qquad (4.9)$$

where

$$\delta(n) = \sum_{i=1}^{p} \delta_i(n)^2 / p .$$

Using (4.4), we obtain

$$\frac{1}{n}(L_c(\hat{\theta}) - L_c(\theta_0)) = A\left(\frac{1 + c}{1 + 2c}\right)^{3/2} \delta(n)\,\frac{p\log\log n}{n} \qquad (4.10)$$

where $0 < \delta(n) < 1$ and $p = p_0$. Let $S_p$ and $S_{p+q}$ be parameter spaces
with dimensions p and p+q, respectively, and let $S_p$ be a subset of
$S_{p+q}$. If $\hat{\theta}_p$ and $\hat{\theta}_{p+q}$ are the values of $\theta$ which maximize $L_c(\theta)$ over
$S_p$ and $S_{p+q}$, respectively, then $L_c(\hat{\theta}_p) \le L_c(\hat{\theta}_{p+q})$ and

$$-2(L_c(\hat{\theta}_p) - L_c(\theta_0)) \ge -2(L_c(\hat{\theta}_{p+q}) - L_c(\theta_0)). \qquad (4.11)$$

From (4.11), we have that $-2(L_c(\hat{\theta}) - L_c(\theta_0))$ is non-increasing as the
parameter space increases. It is noted that (4.11) is true when $\theta_0$
is in both $S_p$ and $S_{p+q}$; the reduction in $-2L_c(\hat{\theta})$ being due to the
effect of estimating additional parameters.


Eliminating $\delta(n)$ from (4.10) and rearranging yields

$$-\frac{1}{n}(L_c(\hat{\theta}) - L_c(\theta_0)) + A\left(\frac{1 + c}{1 + 2c}\right)^{3/2} \frac{p\log\log n}{n} > 0 \qquad (4.12)$$

for $p \ge p_0$. From the above discussion, a model selection criterion
is to select the model which minimizes the left side of (4.12). The
term $L_c(\theta_0)$ can be ignored since it contains only the true
parameters; however, the term A involves the true, but unknown,
variance $\sigma_0^2$. Using


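$$L_c(\hat{\theta}) = \frac{1}{c}\sum_{t=1}^{n}\left(\frac{1 + c}{2\pi\hat{\sigma}^2}\right)^{c/2(1+c)} \exp[-c(x_t - h_t(\hat{\theta}))^2/2\hat{\sigma}^2] - \frac{n}{c}, \qquad (4.13)$$

$$2(L_c(\hat{\theta}) - L_c(\theta_0)) = \frac{2A}{c}\sum_{t=1}^{n}\left[\left(\frac{\hat{\sigma}^2}{\sigma_0^2}\right)^{-c/2(1+c)} \exp(-c(x_t - h_t(\hat{\theta}))^2/2\hat{\sigma}^2) - \exp(-c(x_t - h_t(\theta_0))^2/2\sigma_0^2)\right]. \qquad (4.14)$$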


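Approximating

$$\left(\frac{\hat{\sigma}^2}{\sigma_0^2}\right)^{-c/2(1+c)} \approx 1 - \frac{c}{2(1+c)}\log(\hat{\sigma}^2/\sigma_0^2) + \frac{1}{2}\left(\frac{c}{2(1+c)}\log(\hat{\sigma}^2/\sigma_0^2)\right)^2$$

for small c and substituting into (4.14) yields

$$2(L_c(\hat{\theta}) - L_c(\theta_0)) \approx \frac{2A}{c}\sum_{t=1}^{n}\left[(\hat{w}_t - w_t) - \hat{w}_t\,\frac{c}{2(1+c)}\log(\hat{\sigma}^2/\sigma_0^2) + \frac{\hat{w}_t}{2}\left(\frac{c\,\log(\hat{\sigma}^2/\sigma_0^2)}{2(1+c)}\right)^2\right] \qquad (4.15)$$

where

$$w_t = \exp(-c(x_t - h_t(\theta_0))^2/2\sigma_0^2)$$

$$\hat{w}_t = \exp(-c(x_t - h_t(\hat{\theta}))^2/2\hat{\sigma}^2)$$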
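
The size of the error in the small-c approximation used before (4.15) can be checked numerically. The sketch below is illustrative only; the function names and the variance ratio of 1.2 are hypothetical choices, not values from the study.

```python
import math


def exact_factor(var_ratio: float, c: float) -> float:
    """Exact value of (sigma_hat^2 / sigma_0^2) ** (-c / (2 (1 + c)))."""
    return var_ratio ** (-c / (2.0 * (1.0 + c)))


def quadratic_approx(var_ratio: float, c: float) -> float:
    """Second-order expansion 1 - u + u^2 / 2 with u = c log(ratio) / (2 (1 + c))."""
    u = c / (2.0 * (1.0 + c)) * math.log(var_ratio)
    return 1.0 - u + 0.5 * u * u


# Hypothetical variance ratio; the error grows with c.
error_small_c = abs(exact_factor(1.2, 0.1) - quadratic_approx(1.2, 0.1))
error_large_c = abs(exact_factor(1.2, 2.0) - quadratic_approx(1.2, 2.0))
```

The leading neglected term is of order u cubed, so the approximation is tight for the values of c between 0.1 and 0.4 considered later, provided the variance ratio is near one.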


For small c and large n, the first and third terms of (4.15) will be
small; thus, $-2(L_c(\hat{\theta}) - L_c(\theta_0))/A$ can be approximated by

$$\frac{1}{1 + c}\log(\hat{\sigma}^2/\sigma_0^2)\sum_{t=1}^{n}\hat{w}_t .$$

At the true model, $(1/n)\sum_t \hat{w}_t$ converges to $(1 + c)^{-1/2}$ for Gaussian data.
From the above discussion, it follows that

$$-2(L_c(\hat{\theta}) - L_c(\theta_0))/A \approx \frac{n\log(\hat{\sigma}^2/\sigma_0^2)}{(1 + c)^{3/2}} + t(c,p) \qquad (4.16)$$


where t(c,p) contains terms involving powers of c and the number of
parameters p. Using (4.12) and (4.16), a model-critical selection
criterion is to select the model which minimizes

$$\mathrm{PSIC}(p,c) = \log\hat{\sigma}^2(c) + 2(s(c)p/n)\log\log n \qquad (4.17)$$


where $s(c) \ge [(1 + c)^2/(1 + 2c)]^{3/2}$. The restriction on s(c) is
required to offset the effect of t(c,p) in (4.16). The form of (4.17)
is a generalization of the Hannan-Quinn (1979) procedure and reduces to
their procedure for c = 0.
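
As an illustration of how (4.17) might be applied, the following Python sketch evaluates the criterion over a set of candidate orders. It assumes that the model-critical variance estimate for each candidate order has already been computed by a fitting routine not shown here. The function names `s_lower_bound`, `psic` and `select_order`, and the numbers in the example, are hypothetical and are not taken from the study.

```python
import math
from typing import Dict, Optional


def s_lower_bound(c: float) -> float:
    """Lower bound on s(c) stated with (4.17): [(1 + c)^2 / (1 + 2c)]^(3/2)."""
    return ((1.0 + c) ** 2 / (1.0 + 2.0 * c)) ** 1.5


def psic(sigma2_hat: float, p: int, n: int, c: float,
         s: Optional[float] = None) -> float:
    """PSIC(p, c) = log(sigma2_hat) + 2 (s p / n) log log n, as in (4.17).

    sigma2_hat is assumed to be the model-critical variance estimate for a
    model with p parameters; s defaults to the lower bound on s(c).
    """
    if s is None:
        s = s_lower_bound(c)
    return math.log(sigma2_hat) + 2.0 * (s * p / n) * math.log(math.log(n))


def select_order(sigma2_by_order: Dict[int, float], n: int, c: float,
                 s: Optional[float] = None) -> int:
    """Return the candidate order with the smallest PSIC value."""
    return min(sigma2_by_order,
               key=lambda p: psic(sigma2_by_order[p], p, n, c, s))


# Hypothetical variance estimates for orders 1 to 4 with n = 100.
estimates = {1: 2.0, 2: 1.2, 3: 1.19, 4: 1.185}
best = select_order(estimates, n=100, c=0.3)
```

Calling `psic` with c = 0 and s = 1 gives the Hannan-Quinn form of the criterion, which is the limiting case noted above.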


2.5 Monte Carlo Analysis of PSIC

In the above discussion, it was assumed that the data were
Gaussian with no contamination such as outliers and that the data were
processed using a small value of c, which reflects the amount of
criticism. A Monte Carlo study was performed to evaluate the PSIC
criterion and to examine the sensitivity to the assumptions. The
analysis examined autoregressive models of order p = 1, 2, 4, and 8
with innovations from the Gaussian, Gamma, uniform and t distributions;
for p = 8 only Gaussian innovations were considered. For the critical
parameter c, the values used in the estimation and selection procedure
were c = 0.1, 0.2, 0.3, and 0.4. The model selection procedure was
also applied to a Gaussian AR(p) process with four additive outliers
for p = 1, 2, and 4. A Gaussian AR(8) process with eight additive
outliers was also examined. The locations of the outliers in the time
series were randomly selected using the uniform distribution. The
outliers had a Gaussian distribution with zero mean and variance
$\sigma^2$. The model order was selected for 1000 realizations of an
autoregressive process with sample sizes of 50, 100 or 200. For
the Hannan-Quinn procedure, denoted H-Q, a value of s(0) = 1 was used
and for the model-critical procedure PSIC(p,c), c > 0, the values of
s(c) = 1 and s(c) = (1 + c)^2 were used. For an AR(4) process, order
selection results are presented in Tables 2.10 to 2.20. The
experimental results indicate that the selection procedures are not
sensitive to the innovations distribution. The H-Q and PSIC selection
criteria performed better than the AIC when no outliers are present.
However, when additive outliers are present, the PSIC selection
procedure improves as c increases as a result of the outliers being
downweighted. Since maximum likelihood estimation weights all data
equally, a higher order model is required to fit the AR(p) process with
outliers. The downweighting of outliers in model-critical estimation
results in obtaining the best AR(p) model which fits the bulk of the
data. The results for AR(1) and AR(2) processes are similar to the
AR(4) results shown in the tables. The effect of s(c) is seen in the
results of PSIC 1 and PSIC 2. Using s(c) = 1, PSIC 1 is not as
effective as H-Q; for s(c) = (1 + c)^2, PSIC 2 outperforms H-Q. When
the data contain outliers, both PSIC 1 and PSIC 2 become more effective
as c increases. Our experiments have shown that s(c) = (1 + c)^2
produces the largest s(c) which results in few selections of underfit
models. A more conservative s(c) is (1 + c), which will yield results
between those of PSIC 1 and PSIC 2.
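
The data-generating step of such a study can be sketched as follows. This is a minimal illustration, not the program used for the tables: it takes the AR(4) coefficients as printed in the caption of Table 2.13, reads Normal(0,4) as a variance of 4 (standard deviation 2), and uses hypothetical helper names `simulate_ar` and `add_outliers`. The burn-in length and the seed are arbitrary.

```python
import random
from typing import List, Sequence, Tuple


def simulate_ar(coeffs: Sequence[float], n: int, rng: random.Random,
                burn_in: int = 200) -> List[float]:
    """Simulate x_t = sum_k coeffs[k-1] * x_{t-k} + e_t with e_t ~ Normal(0, 1)."""
    p = len(coeffs)
    x = [0.0] * p  # zero start-up values, discarded with the burn-in
    for _ in range(burn_in + n):
        mean = sum(a * x[-k] for k, a in enumerate(coeffs, start=1))
        x.append(mean + rng.gauss(0.0, 1.0))
    return x[-n:]


def add_outliers(x: Sequence[float], count: int, sd: float,
                 rng: random.Random) -> Tuple[List[float], List[int]]:
    """Add `count` Gaussian outliers at uniformly chosen distinct positions."""
    y = list(x)
    positions = sorted(rng.sample(range(len(x)), count))
    for t in positions:
        y[t] += rng.gauss(0.0, sd)
    return y, positions


rng = random.Random(12345)
coeffs = [2.0625, -2.4325, 1.5845, -0.625]
x = simulate_ar(coeffs, 100, rng)
y, where = add_outliers(x, 4, 2.0, rng)
```

Each realization `y` would then be passed to the order selection procedures; repeating this 1000 times and tabulating the selected orders gives frequency tables of the kind shown below.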


TABLE 2.10

Frequencies of the Order Selected for the Model
$x_t = 2.0625x_{t-1} - 2.4325x_{t-2} + 1.5845x_{t-3} - 0.652x_{t-4} + e_t$
with $e_t$ Distributed Normal(0,1) and Sample Size 50
TABLE 2.11

Frequencies of the Order Selected for the Model
$x_t = 2.0625x_{t-1} - 2.4325x_{t-2} + 1.5845x_{t-3} - 0.652x_{t-4} + e_t$
with $e_t$ Distributed Normal(0,1) and Sample Size 100

                    ≤2     3     4     5     6     7     8     9   ≥10
AIC                  0     0   544   139    77    63    57    47    73
H-Q                  0     0   766   109    57    28    12    14    14
c = 0.1   PSIC 1     0     0   728   124    57    36    19    17    19
          PSIC 2     0     0   803   107    50    21     6     9     4
c = 0.2   PSIC 1     0     0   677   130    71    43    27    23    29
          PSIC 2     0     0   842    92    37    17     5     4     3
c = 0.3   PSIC 1     0     0   592   149    88    47    40    33    51
          PSIC 2     0     0   879    72    26    12     5     6     0
c = 0.4   PSIC 1     0     0   511   153    95    63    58    49    71
          PSIC 2     0     0   883    63    24    13     4     9     4


TABLE 2.12

Frequencies of the Order Selected for the Model
$x_t = 2.0625x_{t-1} - 2.4325x_{t-2} + 1.5845x_{t-3} - 0.652x_{t-4} + e_t$
with $e_t$ Distributed Normal(0,1) and Sample Size 200
TABLE 2.13

Frequencies of the Order Selected for the Model
$y_t = x_t + v_t$ Where
$x_t = 2.0625x_{t-1} - 2.4325x_{t-2} + 1.5845x_{t-3} - 0.625x_{t-4} + e_t$
with $e_t$ Distributed Normal(0,1), $v_t$ Distributed Normal(0,4),
4 Values of $v_t \ne 0$, and Sample Size 50

                    ≤2     3     4     5     6     7     8     9   ≥10
AIC                 39    25   179   285   123    81    83    50   135
H-Q                 70    35   255   315   111    60    61    26    67
c = 0.1   PSIC 1    52    36   261   297   108    66    65    30    85
          PSIC 2    81    43   301   318    91    47    47    17    55
c = 0.2   PSIC 1    42    31   248   252   102    66    74    55   130
          PSIC 2    85    44   339   278    92    40    37    27    58
c = 0.3   PSIC 1    34    21   220   198    97    69    81    94   186
          PSIC 2    87    44   372   219    77    39    41    41    80
c = 0.4   PSIC 1    26    18   172   134    87    60    81   121   301
          PSIC 2    93    50   349   167    71    27    47    55   141

TABLE 2.14

Frequencies of the Order Selected for the Model
$y_t = x_t + v_t$ Where
$x_t = 2.0625x_{t-1} - 2.4325x_{t-2} + 1.5845x_{t-3} - 0.625x_{t-4} + e_t$
with $e_t$ Distributed Normal(0,1), $v_t$ Distributed Normal(0,4),
4 Values of $v_t \ne 0$, and Sample Size 100


TABLE 2.15

Frequencies of the Order Selected for the Model
$y_t = x_t + v_t$ Where
$x_t = 2.0625x_{t-1} - 2.4325x_{t-2} + 1.5845x_{t-3} - 0.625x_{t-4} + e_t$
with $e_t$ Distributed Normal(0,1), $v_t$ Distributed Normal(0,4),
4 Values of $v_t \ne 0$, and Sample Size 200


TABLE 2.17

Frequencies of the Order Selected for the Model
$x_t = 2.0625x_{t-1} - 2.4325x_{t-2} + 1.5845x_{t-3} - 0.625x_{t-4} + e_t$
with $e_t$ Distributed Gamma(3,1), and Sample Size 100


TABLE 2.18

Frequencies of the Order Selected for the Model
$x_t = 2.0625x_{t-1} - 2.4325x_{t-2} + 1.5845x_{t-3} - 0.625x_{t-4} + e_t$
with $e_t$ Distributed Gamma(1.25,1), and Sample Size 100


TABLE 2.19

Frequencies of the Order Selected for the Model
$x_t = 2.0625x_{t-1} - 2.4325x_{t-2} + 1.5845x_{t-3} - 0.625x_{t-4} + e_t$
with $e_t$ Distributed t(10), and Sample Size 100

TABLE 2.20

Frequencies of the Order Selected for the Model
$x_t = 2.0625x_{t-1} - 2.4325x_{t-2} + 1.5845x_{t-3} - 0.625x_{t-4} + e_t$
with $e_t$ Distributed t(3), and Sample Size 100

                    ≤2     3     4     5     6     7     8     9   ≥10
AIC                  0     0   592   121    74    56    47    47    63
H-Q                  0     0   753   100    47    36    23    20    21
c = 0.1   PSIC 1     0     0   658   131    64    60    33    21    33
          PSIC 2     0     0   748   113    46    42    19    14    18
c = 0.2   PSIC 1     0     0   609   136    83    58    46    31    37
          PSIC 2     0     0   757   125    42    32    19     9    16
c = 0.3   PSIC 1     0     0   517   149    90    78    63    38    65
          PSIC 2     0     0   772   122    47    20    17     6    16
c = 0.4   PSIC 1     0     0   432   148    90    85    78    69    98
          PSIC 2     0     0   787   107    42    22    18     9    15

The PSIC and H-Q model selection procedures are affected by the
location of the roots of the characteristic equation:

$$1 - \sum_{h=1}^{p}\alpha_h z^{-h} = 0 . \qquad (5.1)$$

For small to moderate sample sizes, these procedures tend to
underestimate the model order if the roots of (5.1) are "well" within
the unit circle; as the roots of (5.1) get closer to the unit circle,
the procedures tend to estimate the true order. The effect of the
location of the zeroes of (5.1) on the model selection process was
examined by selecting the zeroes to be uniformly distributed over some
annulus or group of annuli inside the unit circle. The strategy of
drawing models in this way allows us to obtain results for a multitude
of models within a certain class as opposed to repeatedly drawing
realizations from the same model. For example, one could determine N
distinct pth-order models by drawing N p-triplets from inside the unit
circle |z| = 1 in order to compare one procedure such as H-Q against
PSIC. We report in detail on one annulus examined closely, namely
0.875 < |z| < 0.99, where z is a root of (5.1). Simulations were
performed for AR(8) processes. The selection results for 1000
realizations of Gaussian AR(8) processes are presented in Tables 2.21
and 2.22. For N = 200, the AR(8) process was examined both with and
without outliers. These results are similar to the case of a fixed
model.
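
The drawing of random models can be sketched in Python as follows. The sketch assumes that (5.1) has the form $1 - \sum_h \alpha_h z^{-h} = 0$, so that the $\alpha_h$ are read off the monic polynomial whose zeroes are the drawn roots, and it assumes that roots are drawn in complex conjugate pairs so that the coefficients are real. How the original study handled these details is not stated; the function names are hypothetical.

```python
import cmath
import math
import random
from typing import List


def draw_roots(p: int, r_min: float, r_max: float,
               rng: random.Random) -> List[complex]:
    """Draw p roots, uniform over the annulus r_min <= |z| <= r_max,
    as p/2 complex conjugate pairs (p must be even in this sketch)."""
    if p % 2 != 0:
        raise ValueError("this sketch only handles even p")
    roots: List[complex] = []
    for _ in range(p // 2):
        # Uniform over area: the squared radius is uniform on [r_min^2, r_max^2].
        radius = math.sqrt(rng.uniform(r_min ** 2, r_max ** 2))
        angle = rng.uniform(0.0, math.pi)
        z = cmath.rect(radius, angle)
        roots.extend([z, z.conjugate()])
    return roots


def ar_coefficients(roots: List[complex]) -> List[float]:
    """Expand prod_k (z - z_k) = z^p - alpha_1 z^(p-1) - ... - alpha_p
    and return [alpha_1, ..., alpha_p]."""
    poly = [1 + 0j]  # coefficients in descending powers of z
    for z in roots:
        shifted = poly + [0j]          # multiply the polynomial by z
        for i, coef in enumerate(poly):
            shifted[i + 1] -= z * coef  # subtract z_k times the polynomial
        poly = shifted
    return [-coef.real for coef in poly[1:]]


rng = random.Random(1)
roots = draw_roots(8, 0.875, 0.99, rng)
alpha = ar_coefficients(roots)
```

Because every root lies strictly inside the unit circle, each drawn coefficient vector corresponds to a stationary AR(8) model.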
TABLE 2.21

Frequencies of the Order Selected For Gaussian AR(8) Models
Where the Roots, z, of the Characteristic Equation Satisfy
0.875 < |z| < 0.99, Sample Size 200, and c = 0.3

           ≤5     6     7     8     9    10    11    12   ≥13
AIC         0     0     0   444   132    68    73    97   186
H-Q         0     0     0   621    94    57    54    64   110
PSIC 1      0     0     1   524   136    61    69    80   129
PSIC 2      1     1     0   715    74    31    44    68    66


TABLE 2.22

Frequencies of the Order Selected For Gaussian AR(8) Models
with 8 Additive Outliers Where the Roots, z, of the Characteristic
Equation Satisfy 0.875 < |z| < 0.99, Sample Size 200, and c = 0.3

           ≤5     6     7     8     9    10    11    12   ≥13
AIC         4     4     4    75   131   210   191   185   196
H-Q        23    12    22   119   204   246   164   124    86
PSIC 1     13    16    21   153   220   181   135   115   145
PSIC 2     36    33    40   260   269   173    81    59    49

2.6 Discussion

The derivation of (4.17) assumed univariate data. The PSIC
criterion can be extended to r-variate models by

$$\mathrm{PSIC}(p,c) = \log|D(c)| + (2ps(c)/n)\log\log n \qquad (6.1)$$

where D(c) is the model-critical estimate of the error
variance-covariance matrix, p is the number of parameters in the model,
and $s(c) \ge [(1 + c)^2/(1 + 2c)]^{3/2}$. Since a process, especially a
multivariate one, can have more than one representation, constraints on the
structure of the model are, in general, required in order to effectively
use a selection criterion such as (6.1). Model constraints are a major
issue in multivariate model identification and are not discussed here
(see Denham 1974; Dunsmuir and Hannan 1976; and Hannan 1969, 1981, and
1982).

As an asymptotic result, the law of the iterated logarithm requires
a large sample size. For small sample sizes, the term loglogn in (4.17)
and (6.1) could be eliminated; the result is an Akaike-like selection
procedure which is less likely to underestimate the model order. If
the model order is underestimated, the parameter estimates for the
resulting model will be biased (see Draper and Smith, 1966). This may
have serious consequences if the model is to be used for prediction.
If the model order is overestimated, too many parameters will be
estimated and some efficiency will be lost. For prediction,
overestimating the model order is not as serious as underestimating
the model order. Also, goodness of fit tests will be affected more
seriously by underestimating the model order than by overestimating
the model order.


PART 3


MODEL-CRITICAL ESTIMATION FOR UNIVARIATE ARMA(p,q) MODELS


3.1 Introduction

In Part 2, procedures to obtain model-critical parameter estimates
for autoregressive (AR) processes were presented. Model-critical
estimation procedures for autoregressive-moving average (ARMA) models
are presented in this part. As in autoregressive models, outliers or
model anomalies can be masked by the structure of the ARMA model. The
AR(4) example in Part 2 illustrates that plots of the data may not
illuminate model difficulties. The autocorrelations which are used to
obtain the parameter estimates tend to accommodate outliers or model
anomalies (Barnett and Lewis, 1978). Unlike the independent
observation case where an outlier usually does not affect other
observations, an outlier in an ARMA process may or may not affect other
(usually subsequent) observations depending on the mechanism which
produced the outlier.

Harvey and Phillips (1979) and Jones (1980) have presented Kalman
filter algorithms to calculate the exact likelihood of a Gaussian ARMA
process. The recursive procedure of the Kalman filter makes it ideal
for model-critical analysis, since data inconsistent with the model can
be downweighted during the estimation of the model parameters. Since
the usual log likelihood function is a special case of the generalized
likelihood function, virtually the same computer program can be used to
calculate the log likelihood and generalized likelihood functions.
Nonlinear optimization programs can be used to maximize the likelihood
functions and thereby obtain the maximum likelihood and model-critical
parameter estimates.

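3.2 Generalized Likelihood for ARMA Models

Let $x_1, x_2, \ldots, x_n$ be a realization of a zero mean ARMA(p,q)
process with representation

$$x_t = \sum_{k=1}^{p}\alpha_k x_{t-k} + \epsilon_t + \sum_{k=1}^{q}\beta_k\epsilon_{t-k}, \qquad (2.1)$$

where $\epsilon_t$ is distributed $N(0, \sigma^2)$ and $E[\epsilon_t\epsilon_s] = 0$ for $t \ne s$. For
stationarity and invertibility, the roots of the characteristic
equations

$$1 - \sum_{k=1}^{p}\alpha_k z^k = 0$$

and

$$1 + \sum_{k=1}^{q}\beta_k z^k = 0$$

are assumed to lie outside the unit circle, |z| = 1. Using the
multiplication rule, the log likelihood $L_0(\alpha, \beta, \sigma^2)$ can be
expressed as

$$L_0(\alpha, \beta, \sigma^2) = \sum_{t=1}^{n}\log f_t(x_t|\tilde{x}_{t-1}, \alpha, \beta, \sigma^2) \qquad (2.3)$$

where

$$\alpha^T = (\alpha_1, \alpha_2, \ldots, \alpha_p),$$

$$\beta^T = (\beta_1, \beta_2, \ldots, \beta_q),$$

$$\tilde{x}_{t-1}^T = (x_1, x_2, \ldots, x_{t-1}),$$

$$f_1(x_1|\tilde{x}_0, \alpha, \beta, \sigma^2) = f_1(x_1|\alpha, \beta, \sigma^2),$$

$$f_t(x_t|\tilde{x}_{t-1}, \alpha, \beta, \sigma^2) = (2\pi\sigma_t^2)^{-1/2}\exp[-(x_t - x(t|t-1))^2/2\sigma_t^2], \qquad (2.4)$$

$$x(t|t-1) = E[x_t|\tilde{x}_{t-1}, \alpha, \beta, \sigma^2], \qquad (2.5)$$

and

$$\sigma_t^2 = \mathrm{Var}[x_t|\tilde{x}_{t-1}, \alpha, \beta, \sigma^2]. \qquad (2.6)$$

Using equations (2.4) to (2.6), the log likelihood of equation (2.3)
can be written as

$$L_0(\alpha, \beta, \sigma^2) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{n}\log\sigma_t^2 - \frac{1}{2}\sum_{t=1}^{n}(x_t - x(t|t-1))^2/\sigma_t^2 . \qquad (2.7)$$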


The model-critical parameter estimates are a(c) and b(c), which are vectors
of the autoregressive and moving average parameters, respectively, and
$s^2(c)$, which maximize the generalized likelihood function

$$L_c(\alpha, \beta, \sigma^2) = \frac{1}{c}\sum_{t=1}^{n}\frac{f_t^c(x_t|\tilde{x}_{t-1}, \alpha, \beta, \sigma^2)}{Q_t^a(\alpha, \beta, \sigma^2, c)} - \frac{n}{c} \qquad (2.8)$$

where

$f_t(x_t|\tilde{x}_{t-1}, \alpha, \beta, \sigma^2)$ is defined by (2.4),

$$Q_t(\alpha, \beta, \sigma^2, c) = [(1 + c)(2\pi\sigma_t^2)^c]^{-1/2}, \qquad (2.9)$$

a = c/(1 + c), and c is the model-critical parameter. The function
$Q_t(\alpha, \beta, \sigma^2, c)$ is the information generating function of the density
(2.4) and defined as in (2.2.2). In the next section, it will be shown
that, given $\alpha$ and $\beta$, expressions for x(t|t-1) and $\sigma_t^2$ can be obtained;
with x(t|t-1) and $\sigma_t^2$ available, $L_0(\alpha, \beta, \sigma^2)$ and $L_c(\alpha, \beta, \sigma^2)$ can be
evaluated. Unlike the case of independent observations,
$f_t(x_t|\tilde{x}_{t-1}, \alpha, \beta, \sigma^2)$ and $Q_t(\alpha, \beta, \sigma^2, c)$ depend on the observation t.
For the remainder of this part, $f_t$ and $Q_t$ will denote
$f_t(x_t|\tilde{x}_{t-1}, \alpha, \beta, \sigma^2)$ and $Q_t(\alpha, \beta, \sigma^2, c)$, respectively.

The generalized likelihood $L_c(\alpha, \beta, \sigma^2)$ without the constant term
(-n/c) will be denoted by L(c). The model-critical estimate for
$\theta^T = (\alpha^T, \beta^T, \sigma^2)$ is the solution $\theta(c)$ to

$$\frac{\partial L(c)}{\partial\theta} = \sum_{t=1}^{n}\frac{f_t^c}{Q_t^a}\left[\frac{\partial\log f_t}{\partial\theta} - \frac{1}{1 + c}\frac{\partial\log Q_t}{\partial\theta}\right] = 0 . \qquad (2.10)$$


The following discussion applies to the selected ARMA(p,q) model.
As in the independent sample case, each term in (2.10) is weighted by
$f_t^c/Q_t^a$. For $f_t$ and $Q_t$ defined above,

$$f_t^c/Q_t^a = ((1 + c)/2\pi\sigma_t^2)^{a/2}\exp[-c(x_t - x(t|t-1))^2/2\sigma_t^2]. \qquad (2.11)$$

Data corresponding to large $(x_t - x(t|t-1))^2/\sigma_t^2$ will be downweighted
in an adaptive manner as the estimation process iterates to find a
solution; the degree of downweighting being determined by the value of
c. Model-critical estimation filters out non-Gaussian influences and
finds the "best" ARMA(p,q) model consistent with Gaussianity. Data
inconsistent with the Gaussian ARMA(p,q) model will receive small
weights since $(x_t - x(t|t-1))^2$ will be large. As the value of c > 0
is increased, the procedure is more critical of the joint character of
the data and the assumed model. If the data and the Gaussian
ARMA(p,q) model are internally consistent, then the model-critical
estimates a(c), b(c), and $s^2(c)$ and the maximum likelihood estimates
a(0), b(0), and $s^2(0)$ will be approximately equal. Our experiments
have shown that, if the model is ARMA(p,q) and the innovations are
symmetric heavy tailed non-Gaussian, a(c) and b(c) will not differ
much from a(0) and b(0) as c increases. However, $s^2(c)$ will differ
considerably from $s^2(0)$ as c increases. These facts provide a
measure as to the consistency between the data and the model. If p
and q are known, the estimates $s^2(c)$ and $s^2(0)$ can be used to
examine the innovations structure; this will be discussed further in
Part 5.
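
The downweighting in (2.11) can be made concrete with a few lines of Python. The sketch below simply evaluates the weight for a given one-step prediction error; the function name and the numerical values are hypothetical.

```python
import math


def model_critical_weight(x_t: float, pred: float, sigma2_t: float,
                          c: float) -> float:
    """Weight f_t^c / Q_t^a of (2.11) with a = c / (1 + c).

    pred stands for the one-step prediction x(t|t-1) and sigma2_t for its
    prediction variance; both are assumed to come from the Kalman filter.
    """
    a = c / (1.0 + c)
    scale = ((1.0 + c) / (2.0 * math.pi * sigma2_t)) ** (a / 2.0)
    return scale * math.exp(-c * (x_t - pred) ** 2 / (2.0 * sigma2_t))


# Weight of an observation three standard deviations from its prediction,
# relative to an observation that is predicted exactly, for c = 0.3.
relative = (model_critical_weight(3.0, 0.0, 1.0, 0.3)
            / model_critical_weight(0.0, 0.0, 1.0, 0.3))
```

For c = 0 every observation receives weight one, which corresponds to ordinary maximum likelihood; for c > 0 the relative weight decays as exp(-c e^2 / 2) in the standardized prediction error e.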

3.3 Evaluation of the Generalized Likelihood

In the previous section, expressions were obtained for the log
likelihood (2.7) and the generalized likelihood (2.8). For ARMA
models, it is not practical to obtain the parameter estimates by
solving the system of equations defined by (2.10) since the equations
are nonlinear and complicated, especially the equations for
model-critical estimates. It will be shown that given a, b and $s^2$
the values of x(t|t-1) and $\sigma_t^2$ can be calculated for t = 1, 2, ..., n;
hence, (2.7) and (2.8) can be evaluated. Since (2.7) and (2.8) can be
evaluated, they can be maximized by searching over a, b, and $s^2$ via
nonlinear optimization methods. A Kalman filter will be used to obtain
x(t|t-1) and $\sigma_t^2$ recursively. Since the Kalman filter processes each
observation individually, it is ideal for model-critical estimation
because it allows for data inconsistent with the model to be
downweighted during the estimation process. The maximum likelihood
estimates a(0), b(0), and $s^2(0)$ and the model-critical estimates
a(c), b(c), and $s^2(c)$ can be examined to provide insight into the
adequacy of the model. The following is a brief discussion of the
Kalman filter algorithm, which can be found in a number of references
(see Kalman, 1960 and Gelb, 1974).


Let the observation $y_t$ be given by the measurement equation

$$y_t = z^T w_t + u_t \qquad (3.1)$$

where $y_t$ is the observed value, z is a k x 1 vector of fixed known
values, $w_t$ is the k x 1 state vector of the system, and $u_t$ is the
measurement error. The measurement error, $u_t$, is usually assumed to
be zero mean Gaussian with variance R. The state equation is given by

$$w_t = Aw_{t-1} + Be_t \qquad (3.2)$$

where A is a k x k transition matrix of known values, B is a k x m
matrix of known values, and $e_t$ is an m x 1 vector of normally
distributed random variables with $E[e_t] = 0$, $E[e_te_t^T] = \sigma^2 Q$ and
$E[e_te_s^T] = 0$ for all $t \ne s$. The known matrix Q is assumed to be
positive definite. Further, $E[u_t] = 0$, $E[u_t^2] = R$, $E[u_tu_s] = 0$
for $t \ne s$, and $E[u_te_s] = 0$ for all t and s. Given the measurements
$y_1, y_2, \ldots, y_{t-1}$, let w(t-1|t-1) denote the minimum mean square
estimate of $w_{t-1}$. Let $\sigma^2 P(t-1|t-1)$ denote the estimation error
covariance matrix, where P(t-1|t-1) is known. That is,

$$E[(w_{t-1} - w(t-1|t-1))(w_{t-1} - w(t-1|t-1))^T] = \sigma^2 P(t-1|t-1).$$

Let w(t|t-1) and P(t|t-1) denote the predicted values of $w_t$ and P
given $y_1, y_2, \ldots, y_{t-1}$. These quantities are given by the
prediction equations

$$w(t|t-1) = Aw(t-1|t-1) \qquad (3.3)$$

and

$$P(t|t-1) = AP(t-1|t-1)A^T + BQB^T . \qquad (3.4)$$

Using the tth observation, $y_t$, w(t|t) and P(t|t) are obtained by
the update equations

$$w(t|t) = w(t|t-1) + P(t|t-1)zb_t^{-1}(y_t - z^Tw(t|t-1)) \qquad (3.5)$$

and

$$P(t|t) = P(t|t-1) - P(t|t-1)zb_t^{-1}z^TP(t|t-1) \qquad (3.6)$$

where

$$b_t = z^TP(t|t-1)z + R . \qquad (3.7)$$



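The ARMA(p,q) model in equation (2.1) can be written as

$$x_t = \sum_{k=1}^{r}\alpha_k x_{t-k} + \epsilon_t + \sum_{k=1}^{r-1}\beta_k\epsilon_{t-k} \qquad (3.8)$$

where r = max(p,q+1), with $\alpha_k = 0$ for k > p and $\beta_k = 0$ for k > q.
Alternatively, (3.8) can be expressed in Markovian form by

$$w_t = \begin{bmatrix} \alpha_1 & 1 & 0 & \cdots & 0 \\ \alpha_2 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \alpha_{r-1} & 0 & 0 & \cdots & 1 \\ \alpha_r & 0 & 0 & \cdots & 0 \end{bmatrix} w_{t-1} + \begin{bmatrix} 1 \\ \beta_1 \\ \vdots \\ \beta_{r-2} \\ \beta_{r-1} \end{bmatrix}\epsilon_t , \qquad (3.9)$$

where the first element of $w_t$ is $x_t$. Equation (3.9) is the state
equation (3.2) in the state space formulation of the ARMA(p,q) model.
The measurement equation corresponding to (3.1) is

$$x_t = z^T w_t \qquad (3.10)$$

where

$$z^T = (1, 0, 0, \ldots, 0).$$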
For ARMA(p,q) models without measurement error, R = 0, Q = 1,

$$A = \begin{bmatrix} \alpha_1 & 1 & 0 & \cdots & 0 \\ \alpha_2 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \alpha_{r-1} & 0 & 0 & \cdots & 1 \\ \alpha_r & 0 & 0 & \cdots & 0 \end{bmatrix} \qquad (3.11)$$

and

$$B = (1, \beta_1, \ldots, \beta_{r-1})^T . \qquad (3.12)$$

To start the recursions in equations (3.3) to (3.7), initial values are
needed for w(0|0) and P(0|0). When no observations are available, the
minimum mean square estimate of $w_0$ is zero; therefore, w(0|0) is set
equal to zero. Since $\sigma^2 P(0|0) = E[w_0w_0^T]$, the covariance matrix of
the state vector, the value of P(0|0) is the solution P of the equation

$$P = APA^T + BQB^T . \qquad (3.13)$$

Equation (3.13) follows using equation (3.2) and noting the
stationarity assumptions on the process (see Kalman, 1960). Using the
fact that $\mathrm{vec}(APA^T) = (A \otimes A)\mathrm{vec}(P)$ (Neudecker, 1969), (3.13) can be
rewritten as

$$[I - A \otimes A]\mathrm{vec}(P) = \mathrm{vec}(BB^T) \qquad (3.14)$$

where $\otimes$ indicates the Kronecker product. $[I - A \otimes A]$, vec(P) and
$\mathrm{vec}(BB^T)$ are defined by

$$[I - A \otimes A] = \begin{bmatrix} I - \alpha_1 A & -A & 0 & \cdots & 0 & 0 \\ -\alpha_2 A & I & -A & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ -\alpha_{r-1} A & 0 & 0 & \cdots & I & -A \\ -\alpha_r A & 0 & 0 & \cdots & 0 & I \end{bmatrix}, \qquad (3.15)$$

$$\mathrm{vec}(BB^T) = (B^T, \beta_1 B^T, \ldots, \beta_{r-1} B^T)^T , \qquad (3.16)$$

and

$$\mathrm{vec}(P) = (p_1^T, p_2^T, \ldots, p_r^T)^T \qquad (3.17)$$

where $p_i$ is the ith column of P. The $r^2 \times r^2$ matrix $[I - A \otimes A]$
does not have to be inverted. The system of equations (3.14) can be
transformed so that the $r^2 \times r^2$ coefficient matrix is block lower
triangular. In this form, only an r x r matrix inversion is required
to obtain $p_1$; the remaining columns of P are obtained recursively.
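
For small r, the system (3.14) can also be solved directly, which is convenient for checking an implementation. The sketch below builds $I - A \otimes A$ with column-stacked vec and solves the full $r^2 \times r^2$ system by Gaussian elimination; it does not use the block lower triangular reduction described above, and the function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def solve_linear(M: Matrix, v: List[float]) -> List[float]:
    """Solve M s = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, n):
            factor = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= factor * aug[col][j]
    sol = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (aug[i][n] - tail) / aug[i][i]
    return sol


def stationary_covariance(A: Matrix, B: List[float]) -> Matrix:
    """Solve P = A P A^T + B B^T through [I - A (x) A] vec(P) = vec(B B^T)."""
    r = len(A)
    size = r * r
    M = [[0.0] * size for _ in range(size)]
    rhs = [0.0] * size
    for j in range(r):
        for i in range(r):
            row = j * r + i  # position of P[i][j] in the column-stacked vec(P)
            rhs[row] = B[i] * B[j]
            for l in range(r):
                for k in range(r):
                    col = l * r + k
                    identity = 1.0 if row == col else 0.0
                    M[row][col] = identity - A[j][l] * A[i][k]
    vec_p = solve_linear(M, rhs)
    return [[vec_p[j * r + i] for j in range(r)] for i in range(r)]


# ARMA(1,1) with alpha_1 = 0.5 and beta_1 = 0.4, so r = 2.
P = stationary_covariance([[0.5, 1.0], [0.0, 0.0]], [1.0, 0.4])
```

For an AR(1) process with coefficient 0.5 the same function returns the familiar value 1/(1 - 0.25) for the single element of P.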

With w(0|0) and P(0|0) defined, equations (3.3) to (3.7) can be
computed for each observation $y_t = x_t$ for t = 1, 2, ..., n. Thus,
given $\alpha$, $\beta$ and $\sigma^2$, $L_0(\alpha, \beta, \sigma^2)$ and $L_c(\alpha, \beta, \sigma^2)$ can be computed by
setting $\sigma_t^2 = \sigma^2 b_t$ and $x(t|t-1) = z^Tw(t|t-1)$ in equations (2.7) and
(2.8). With $\sigma_t^2 = \sigma^2 b_t$, $L_0(\alpha, \beta, \sigma^2) = L_0(\alpha, \beta)$ can be maximized by
searching over $\alpha$ and $\beta$, and $\sigma^2$ is estimated by

$$\hat{\sigma}^2 = n^{-1}\sum_{t=1}^{n}(x_t - x(t|t-1))^2/b_t . \qquad (3.18)$$

However, to maximize $L_c(\alpha, \beta, \sigma^2)$ requires searching over $\alpha$, $\beta$, and $\sigma^2$;
this follows by noting that an explicit expression for $s^2$ cannot be
obtained from equation (2.10). In fact, the estimates a(c), b(c), and
$s^2(c)$ are interrelated. The Kalman filter algorithm as presented
includes the possibility of observation error. Thus, for ARMA models
with observation error $u_t \ne 0$, the additional parameter R, the
variance of $u_t$, must be estimated. That is, the log likelihood and
generalized likelihood must be maximized with respect to $\alpha$, $\beta$, $\sigma^2$,
and R. The observations are $y_t = x_t + u_t$, where $x_t$ is the ARMA
process of (2.1) or (3.8). Then, $y_t$ replaces $x_t$ in (2.7) and


(2.8);  replaces  in  (2.8).  The  inclusion  of  observation 

error  in  the  model  can  result  in  a  more  parsimonious  model 
representation  (see  Box  and  Jenkins,  1970). 

Since L(c) is nonlinear for ARMA(p,q) processes, the Ellipsoid algorithm of Kupferschmid and Ecker (1984) for nonlinear optimization is used to calculate the maximum of L(c). Without a priori information, the initial $a_i(c)$'s and $b_j(c)$'s are set to zero and, for c ≠ 0, $s^2(c)$ is set equal to one half the sample variance of the data. An ellipsoid about these initial values must be given such that the ellipsoid contains the true parameters. In general, the algorithm will terminate at the maximum of L(c) inside the ellipsoid. If it is suspected that L(c) evaluated at a(c), b(c), and $s^2(c)$ is not the global maximum, the algorithm can be applied again using the current values of a(c), b(c), and $s^2(c)$ as the center of a new ellipsoid. In general, the size of the new ellipsoid will be smaller than the previous ellipsoid. In our experiments, we maximized L(0) by searching over $\alpha$ and $\beta$; $\sigma^2$ was estimated using equation (3.18). These estimates (denoted a(0), b(0) and $s^2(0)$) were used as initial values for the maximization of L(c). The difference between calculating L(0) and L(c) can be examined using equations (2.7), (2.8) and (2.11). L(0) requires calculating $\log \sigma_t^2$ and $(x_t - x(t|t-1))^2 / 2\sigma_t^2$, whereas L(c) requires calculating $(\sigma_t^2)^{-a/2}$, with a = c/(1 + c), and $\exp[-c\,(x_t - x(t|t-1))^2 / 2\sigma_t^2]$.

Since the above are the only variations between calculating L(0) and L(c), the same computer program can be used for both calculations with a switch to indicate the evaluation of L(0) or L(c).
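The idea of one routine with a switch can be sketched as follows. This is an illustration, not the program used for the experiments: it assumes the one-step prediction errors and their variances $\sigma_t^2$ are already available from the filter, it keeps all constant terms, and the function name `evaluate_likelihood` is hypothetical. The univariate weight used for c ≠ 0 is the scalar case of the density raised to the power c divided by the information generating function raised to the power a = c/(1 + c).

```python
import numpy as np

def evaluate_likelihood(resid, var, c=0.0):
    """Return L(0) when c == 0 and the generalized likelihood L(c) otherwise.

    resid : one-step prediction errors x_t - x(t|t-1)
    var   : their variances sigma_t^2 (same length as resid)
    """
    resid = np.asarray(resid, dtype=float)
    var = np.asarray(var, dtype=float)
    z2 = resid ** 2 / var
    if c == 0.0:
        # Gaussian log likelihood: needs log(sigma_t^2) and the scaled squares.
        return -0.5 * np.sum(np.log(2.0 * np.pi * var) + z2)
    a = c / (1.0 + c)
    # Each term is f_t^c / Q_t^a: needs (sigma_t^2)^(-a/2) and exp(-c z2 / 2).
    terms = ((1.0 + c) / (2.0 * np.pi * var)) ** (a / 2.0) * np.exp(-0.5 * c * z2)
    return np.sum(terms) / c
```

For small c, `evaluate_likelihood(resid, var, c) - n / c` approaches the c = 0 value, which gives a simple consistency check between the two branches.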


To ensure that the parameter estimates satisfy the stationarity and invertibility criteria in (2.2), we reparameterize in terms of the partial autoregressive coefficients $a_i$ and the partial moving average coefficients $b_j$, which have values in the open interval (-1,1). The autoregressive and moving average coefficients are calculated by the Levinson-Durbin (1960) recursion (as cited in Jones, 1980). For
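i = 1, 2, ..., p; j = 1, 2, ..., q

$$\alpha_i^{(i)} = a_i, \qquad \beta_j^{(j)} = b_j \tag{3.19}$$

and for i > 1, j > 1

$$\alpha_k^{(i)} = \alpha_k^{(i-1)} - a_i\,\alpha_{i-k}^{(i-1)}, \quad \text{for } k = 1, 2, \ldots, i-1, \tag{3.20a}$$

$$\beta_k^{(j)} = \beta_k^{(j-1)} + b_j\,\beta_{j-k}^{(j-1)}, \quad \text{for } k = 1, 2, \ldots, j-1. \tag{3.20b}$$

The autoregressive coefficients are $\alpha_k = \alpha_k^{(p)}$ for k = 1, 2, ..., p and the moving average coefficients are $\beta_k = \beta_k^{(q)}$ for k = 1, 2, ..., q.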
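The autoregressive half of this reparameterization can be sketched as below. The code applies the recursion with the minus sign of (3.20a); the function names are hypothetical, and the moving average coefficients would be obtained the same way with whatever sign convention the model uses. The stationarity check assumes the autoregressive polynomial $1 - \sum_k \alpha_k z^k$.

```python
import numpy as np

def partial_to_coeffs(partials):
    """Map partial coefficients in (-1, 1) to autoregressive coefficients."""
    coeffs = np.zeros(0)
    for a_i in partials:
        # alpha_k^(i) = alpha_k^(i-1) - a_i * alpha_{i-k}^(i-1), k = 1..i-1,
        # and alpha_i^(i) = a_i is appended as the new last coefficient.
        coeffs = np.append(coeffs - a_i * coeffs[::-1], a_i)
    return coeffs

def is_stationary(coeffs):
    """True when all roots of 1 - sum_k alpha_k z^k lie outside the unit circle."""
    poly = np.r_[-np.asarray(coeffs)[::-1], 1.0]   # highest power first
    return bool(np.all(np.abs(np.roots(poly)) > 1.0))

alpha = partial_to_coeffs([0.5, -0.3])
```

Because every partial coefficient is confined to (-1, 1), an unconstrained search can be run on the partial coefficients while the implied model stays inside the admissible region.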


TABLE 3.1

Maximum Likelihood (c = 0) and Model-Critical (c ≠ 0)
Parameter Estimates for a Simulated ARMA(2,1) Process
with Innovations Distributed N(0,1), and No Outliers

  c        a1         a2         b1         s^2
 0.0     1.5912    -0.9255     0.3253     0.6260
 0.1     1.5924    -0.9272     0.3366     0.5983
 0.2     1.5929    -0.9285     0.3355     0.5591
 0.3     1.5950    -0.9307     0.3296     0.5209
 0.4     1.5948    -0.9303     0.3205     0.4945

TABLE 3.2

Maximum Likelihood (c = 0) and Model-Critical (c ≠ 0)
Parameter Estimates for a Simulated ARMA(2,1) Process
with Innovations Distributed t(5), and No Outliers

  c        a1         a2         b1         s^2
 0.0     1.4599    -0.8179     0.5314     1.5432
 0.1     1.4676    -0.8214     0.4963     1.4018
 0.2     1.4756    -0.8235     0.4747     1.2777
 0.3     1.4807    -0.8274     0.4550     1.1761
 0.4     1.4836    -0.8287     0.4467     1.0934
This corresponds to the case of innovative outliers of Fox (1972), and Martin and Thomson (1982). Table 3.2 presents the parameter estimates for c = 0, 0.1, 0.2, 0.3, and 0.4. As in the case when $\epsilon_t$ is normal, the ARMA parameter estimates ($a_1$, $a_2$, $b_1$) do not change much as c increases. However, when $\epsilon_t$ has a t-distribution, the estimate of the innovations variance $\sigma^2$ decreases considerably as c increases. The decrease in the value of $s^2(c)$ as c increases results from the downweighting of $\epsilon_t$ values which are in the tails of the t-distribution. For c = 0.3, Figures 3.3 and 3.4 are plots of the weights $w_t = \exp(-c\,(x_t - x(t|t-1))^2 / \sigma^2 b_t^2)$ for the ARMA(2,1) example with normal and t-distributed errors, respectively. Also, the maximum likelihood residuals for these two examples are shown in Figures 3.5 and 3.6. In both examples, the weights and residuals do not indicate any obvious problems with the data.

Next, four additive outliers (Fox, 1972) were added to the two realizations described above; plots of the realizations are shown in Figures 3.7 and 3.8. As in the plot of the AR(4) process with outliers, the outliers are not obvious. For c = 0, 0.1, 0.2, 0.3, and 0.4, Tables 3.3 and 3.4 present the parameter estimates. In both cases, $b_1(c)$ and $s^2(c)$ change considerably as c varies between 0 and 0.4. The changes in the moving average term $b_1(c)$ result from the additive outliers being downweighted. In both cases, the parameter estimate $b_1(c)$ moves in the direction of the true value. The presence of additive outliers in the data breaks up the structure of the moving
TABLE 3.3

Maximum Likelihood (c = 0) and Model-Critical (c ≠ 0)
Parameter Estimates for a Simulated ARMA(2,1) Process with
Innovations Distributed N(0,1), and 4 Additive Outliers

  c        a1         a2         b1         s^2
 0.0     1.6063    -0.9339    -0.2227     1.3875
 0.1     1.6040    -0.9370    -0.1556     1.0855
 0.2     1.6066    -0.9433    -0.0925     0.8626
 0.3     1.6045    -0.9426    -0.0381     0.7237
 0.4     1.5996    -0.9403     0.0226     0.6309

TABLE 3.4

Maximum Likelihood (c = 0) and Model-Critical (c ≠ 0)
Parameter Estimates for a Simulated ARMA(2,1) Process with
Innovations Distributed t(5), and 4 Additive Outliers

  c        a1         a2         b1         s^2
 0.0     1.4903    -0.8437     0.0943     2.4281
 0.1     1.5064    -0.8553     0.0992     1.8670
 0.2     1.5146    -0.8625     0.1263     1.6718
 0.3     1.5222    -0.8679     0.1623     1.5568
 0.4     1.5251    -0.8690     0.2088     1.4104

average part of the process as seen by $b_1(c)$; as a
unexplained moving average variation is summarized

[Figure: ... Process with Innovations Distributed Normal(0,1), and ... Additive Outliers]

[Figure: ... Process with Innovations Distributed Normal(0,1) ... Four Additive Outliers; c = 0.3]


PART  4 


MODEL-CRITICAL ESTIMATION FOR
MULTIVARIATE ARMA MODELS


4.1 Introduction

In  this  part,  the  results  of  Part  3  are  extended  to  multivariate 
ARMA  models.  Unlike  univariate  models,  multivariate  ARMA  models  have 
not  received  as  much  attention,  primarily  due  to  the  complexity  of 
multivariate  processes.  Estimation  procedures  for  multivariate  ARMA 
models  have  been  presented  by  Wilson  (1973),  Osborn  (1977)  and 
Nicholls  (1976);  some  theoretical  background  has  been  presented  by 
Dunsmuir  and  Hannan  (1976).  Goodness  of  fit  analysis  for  multivariate 
models  has  received  even  less  attention  (see  Hosking,  1980,  and  Poskitt 
and  Tremayne,  1982).  From  the  examples  in  Parts  2  and  3,  it  can  be 
seen  that  parametric  and  distributional  deficiencies  in  the  data  can  be 
masked  by  the  structure  of  the  process.  This  issue  is  complicated 
further  by  the  interaction  of  the  entries  in  a  vector  process.  As  in 
Parts 2 and 3, model-critical procedures will provide a means to assess the adequacy of a multivariate ARMA model by subjecting the data and the model to varying amounts of criticism.

For multivariate observations, it is straightforward to extend the Kalman filter algorithm described in Section 3.3 for the evaluation of the log likelihood and generalized likelihood functions. The nonlinear optimization algorithm of Kupferschmid and Ecker is used to maximize the likelihood functions. As in the univariate case, the same basic computer program can be used to calculate the log likelihood and generalized likelihood functions.


4.2  Generalized  Likelihood  for  Multivariate  ARMA  Models 


Let the m-vectors $x_1, x_2, \ldots, x_n$ be a realization of the autoregressive-moving average, ARMA(p,q), process with representation

$$x_t = \sum_{k=1}^{p} A_k x_{t-k} + \epsilon_t + \sum_{k=1}^{q} B_k \epsilon_{t-k} \tag{2.1}$$

where $\epsilon_t$ is distributed multivariate Gaussian with zero mean vector and covariance matrix D, and $E[\epsilon_t \epsilon_s^{T}] = 0$ for t ≠ s. For stationarity and invertibility, the roots of the determinants

$$\Bigl| I - \sum_{k=1}^{p} A_k z^k \Bigr| = 0, \qquad \Bigl| I + \sum_{k=1}^{q} B_k z^k \Bigr| = 0 \tag{2.2}$$

are assumed to lie outside the unit circle, |z| = 1. Using the multiplication rule, the log likelihood $L_0(x_1, x_2, \ldots, x_n \mid A, B, D)$ can be written as
$$L_0(x_1, x_2, \ldots, x_n \mid A, B, D) = \sum_{t=1}^{n} \log f_t(x_t \mid X_{t-1}, A, B, D) \tag{2.3}$$

where

$$X_{t-1} = (x_1, x_2, \ldots, x_{t-1}),$$

$$A = (A_1, A_2, \ldots, A_p),$$

$$B = (B_1, B_2, \ldots, B_q),$$

$$f_1(x_1 \mid X_0, A, B, D) = f_1(x_1 \mid A, B, D),$$

$$f_t(x_t \mid X_{t-1}, A, B, D) = |2\pi F_t|^{-1/2} \exp\bigl[-\tfrac{1}{2}(x_t - x(t|t-1))^{T} F_t^{-1} (x_t - x(t|t-1))\bigr], \tag{2.4}$$

$$x(t|t-1) = E[x_t \mid X_{t-1}, A, B, D], \tag{2.5}$$

and

$$F_t = \mathrm{Var}[x_t \mid X_{t-1}, A, B, D]. \tag{2.6}$$

By using equations (2.4) through (2.6), the log likelihood equation (2.3) can be written as

$$L_0(x_1, x_2, \ldots, x_n \mid A, B, D) = -\frac{nm}{2}\log 2\pi - \frac{1}{2}\sum_{t=1}^{n} \log |F_t| - \frac{1}{2}\sum_{t=1}^{n} (x_t - x(t|t-1))^{T} F_t^{-1} (x_t - x(t|t-1)). \tag{2.7}$$


1 


The model-critical parameter estimates are the A(c), B(c), and D(c) which maximize the generalized likelihood function

$$L_c(x_1, x_2, \ldots, x_n \mid A, B, D) = \frac{1}{c} \sum_{t=1}^{n} \frac{f_t^{\,c}(x_t \mid X_{t-1}, A, B, D)}{Q_t^{\,a}(A, B, D, c)} \tag{2.8}$$

where $f_t(x_t \mid X_{t-1}, A, B, D)$ is defined by (2.4),

$$Q_t(A, B, D, c) = \bigl[(1+c)^m\, |2\pi F_t|^{c}\bigr]^{-1/2}, \tag{2.9}$$

a = c/(1+c) and $Q_t(A, B, D, c)$ is the information generating function of the density (2.4) as defined by (2.2.2).
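To make (2.7) to (2.9) concrete, the sketch below evaluates both functions from a set of one-step prediction errors and their covariance matrices $F_t$. It assumes these quantities have already been produced by a filter; the array layout (`innov` of shape n by m, `F` of shape n by m by m) and the function names are illustrative choices, and the code requires c > -1 and c ≠ 0 for the generalized likelihood.

```python
import numpy as np

def gaussian_parts(innov, F):
    """Quadratic forms and log|2 pi F_t| for every t."""
    innov = np.asarray(innov, dtype=float)
    F = np.asarray(F, dtype=float)
    quad = np.einsum("ti,tij,tj->t", innov, np.linalg.inv(F), innov)
    _, logdet = np.linalg.slogdet(2.0 * np.pi * F)
    return quad, logdet

def log_likelihood(innov, F):
    """Gaussian log likelihood, equation (2.7), constants included."""
    quad, logdet = gaussian_parts(innov, F)
    return -0.5 * np.sum(logdet + quad)

def generalized_likelihood(innov, F, c):
    """Generalized likelihood, equation (2.8), with Q_t from (2.9)."""
    m = np.asarray(innov).shape[1]
    a = c / (1.0 + c)
    quad, logdet = gaussian_parts(innov, F)
    log_f = -0.5 * (logdet + quad)                  # log of density (2.4)
    log_q = -0.5 * (m * np.log1p(c) + c * logdet)   # log of Q_t in (2.9)
    return np.sum(np.exp(c * log_f - a * log_q)) / c
```

Working with logarithms until the final exponentiation is a numerical convenience of this sketch; it does not change the quantity being computed.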

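For notational convenience, $f_t$ and $Q_t$ will denote $f_t(x_t \mid X_{t-1}, A, B, D)$ and $Q_t(A, B, D, c)$, respectively, throughout the remainder of this part. Letting L(c) denote $L_c(x_1, x_2, \ldots, x_n \mid A, B, D)$ without the constant term, the estimate for $\theta = (A, B, D)$ is the solution $\theta(c)$ to

$$\sum_{t=1}^{n} \frac{f_t^{\,c}}{Q_t^{\,a}} \left[ \frac{\partial \log f_t}{\partial \theta} - \frac{1}{1+c}\,\frac{\partial \log Q_t}{\partial \theta} \right] = 0. \tag{2.10}$$

It can be seen that each term in equation (2.10) is weighted by $f_t^{\,c}/Q_t^{\,a}$; for $f_t$ and $Q_t$ as defined above,

$$\frac{f_t^{\,c}}{Q_t^{\,a}} = \left( \frac{(1+c)^m}{|2\pi F_t|} \right)^{a/2} \exp\Bigl[ -\frac{c}{2}\,(x_t - x(t|t-1))^{T} F_t^{-1} (x_t - x(t|t-1)) \Bigr]. \tag{2.11}$$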


Thus, data corresponding to large values of $(x_t - x(t|t-1))^{T} F_t^{-1} (x_t - x(t|t-1))$ will be downweighted in the estimation process, the degree of downweighting being determined by the value of c. Model-critical estimation is a robust procedure that filters out non-Gaussian influences and finds the "best" multivariate ARMA(p,q) process consistent with Gaussianity. Unlike other robust procedures which estimate location (A,B) and scale (D) parameters separately, model-critical procedures estimate all parameters jointly using the assumed parametric and distributional form of the model. Using the distributional as well as the parametric model yields a natural framework for analyzing goodness of fit. The model-critical parameter c is similar to the robustness constant used by other robust procedures. For c > 0, data inconsistent with the Gaussian ARMA(p,q) model will receive small weights $f_t^{\,c}/Q_t^{\,a}$ since the quadratic form in the exponential will be large. Values of c > 0 lead to criticism of outliers or heavy tailed distributions, whereas values of c < 0 lead to criticism of inliers or short tailed distributions. As the value of |c| is increased, the procedure is more critical of the data and the model; hence, the use of the term model-critical. If the data and the Gaussian ARMA(p,q) model are consistent, then the model-critical estimates $\theta(c)$ and the maximum likelihood estimates $\theta(0)$ will be approximately equal; this provides a subjective measure of fit between the model and the data. Our experiments have shown that, for an ARMA(p,q) process with heavy tailed innovations, the estimates A(c) and B(c) will be close to the estimates A(0) and B(0) as c increases. However, D(c) and D(0) will differ considerably as c increases. When additive outliers (Fox, 1972) are present, A(c), B(c), and D(c) will differ from A(0), B(0), and D(0), respectively.

The additive outlier model (Martin and Thomson, 1982) is

$$y_t = x_t + v_t \tag{2.12}$$

where $x_t$ is the ARMA process and $v_t$ is the outlier. There are two common classes of additive outliers. The first is where $v_t \ne 0$ for few observations and $v_t$ is large relative to $\epsilon_t$. The second class is where $v_t \ne 0$ for every observation and the process $v_t$ has zero mean and covariance matrix R. In the latter situation, the covariance matrix of $v_t$, R, must also be estimated.

4.3  The  Kalman  Filter 

The  engineering  literature  abounds  with  applications  of  the  Kalman 
filter  such  as  missile  tracking  and  parameter  estimation.  However, 
the  statistical  literature  contains  fewer  applications.  Harvey  and 
Phillips (1979) and Jones (1980) have presented Kalman filter algorithms to calculate the Gaussian maximum likelihood estimates for univariate ARMA models. Harvey and Phillips (1976) briefly discuss a Kalman filter algorithm to obtain maximum likelihood estimates for multivariate ARMA models. In general, the use of equation (2.10) to estimate $\theta$ is not practical since expressions for $\partial \log f_t / \partial\theta$ and $\partial \log Q_t / \partial\theta$ are difficult to obtain explicitly for ARMA models.

Instead, the vector $\theta$ is estimated using equation (2.7) or (2.8) via a Kalman filter algorithm. The Kalman filter is used to calculate the value of L(c) and L(0) given $\theta$; the estimate of $\theta$ is obtained by maximizing L(c) or L(0) using nonlinear optimization methods.

The following is a brief discussion of the Kalman filter algorithm which can be found in a number of references (see Kalman, 1960, or Gelb, 1974). Let the observation $x_t$ be defined by the linear system

$$x_t = Z^{T} s_t + v_t \quad \text{(measurement equation)} \tag{3.1}$$

$$s_t = A s_{t-1} + B \epsilon_t \quad \text{(state equation)} \tag{3.2}$$

where

$x_t$ is a m x 1 vector of observations,

$Z^{T}$ is a m x k matrix of known values,

$s_t$ is a k x 1 vector of unknown state parameters,

A is a k x k (transition) matrix,

B is a k x g matrix (g ≤ m),

$v_t$ is a m x 1 vector of Gaussian random variables with $E[v_t] = 0$, $E[v_t v_t^{T}] = R$ and $E[v_i v_j^{T}] = 0$ for all i ≠ j,

$\epsilon_t$ is a g x 1 vector of Gaussian random variables with $E[\epsilon_t] = 0$, $E[\epsilon_t \epsilon_t^{T}] = C$ and $E[\epsilon_i \epsilon_j^{T}] = 0$ for i ≠ j, and

$E[\epsilon_i v_j^{T}] = 0$ for all i and j.


Given the measurements $x_1, x_2, \ldots, x_{t-1}$, let $s(t-1|t-1)$ denote the minimum mean square estimate of $s_{t-1}$. Let $P(t-1|t-1)$ denote the estimation error covariance matrix where $P(t-1|t-1)$ is known. That is,

$$E[(s_{t-1} - s(t-1|t-1))(s_{t-1} - s(t-1|t-1))^{T}] = P(t-1|t-1).$$

Let $s(t|t-1)$ and $P(t|t-1)$ denote the predicted value of $s_t$ and its corresponding error covariance matrix given $x_1, x_2, \ldots, x_{t-1}$; these quantities are given by the prediction equations

$$s(t|t-1) = A\, s(t-1|t-1) \tag{3.3}$$

$$P(t|t-1) = A\, P(t-1|t-1)\, A^{T} + B C B^{T}. \tag{3.4}$$

Using the information in the tth observation $x_t$, $s(t|t)$ and $P(t|t)$ are obtained by the update equations

$$s(t|t) = s(t|t-1) + P(t|t-1)\, Z\, F_t^{-1} (x_t - Z^{T} s(t|t-1)) \tag{3.5}$$

$$P(t|t) = P(t|t-1) - P(t|t-1)\, Z\, F_t^{-1} Z^{T} P(t|t-1) \tag{3.6}$$

where

$$F_t = Z^{T} P(t|t-1)\, Z + R. \tag{3.7}$$

The ARMA(p,q) model in equation (2.1) can be rewritten as

$$x_t = \sum_{k=1}^{r} A_k x_{t-k} + \epsilon_t + \sum_{k=1}^{r-1} B_k \epsilon_{t-k} \tag{3.8}$$


where r = max(p, q+1), with $A_k = 0$ for k > p and $B_k = 0$ for k > q. Alternately, equation (3.8) can be expressed as the first order system

$$s_t =
\begin{bmatrix}
A_1 & I_m & 0 & \cdots & 0 \\
A_2 & 0 & I_m & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
A_{r-1} & 0 & 0 & \cdots & I_m \\
A_r & 0 & 0 & \cdots & 0
\end{bmatrix} s_{t-1} +
\begin{bmatrix} I_m \\ B_1 \\ \vdots \\ B_{r-1} \end{bmatrix} \epsilon_t \tag{3.9}$$

where the first m entries of $s_t$ are the entries of $x_t$, $I_m$ is the m x m identity matrix, and 0 is the m x m matrix of zero entries. Equation (3.9) is the state equation (3.2) in the state space formulation of the ARMA(p,q) model. The measurement equation corresponding to (3.1) is

$$x_t = Z^{T} s_t + v_t \tag{3.10}$$

where

$$Z^{T} = (I_m, 0, \ldots, 0).$$

There are r (m x m) matrix entries in $Z^{T}$ and the product $Z^{T} s_t$ is equal to the first m entries of $s_t$. In the above formulation, if $A_1, A_2, \ldots, A_r$, $B_1, B_2, \ldots, B_{r-1}$, C, R, $s(0|0)$ and $P(0|0)$ are available, equations (3.3) to (3.6) can be used to compute $s(t|t-1)$, $P(t|t-1)$, $s(t|t)$ and $P(t|t)$ for t = 1, 2, ..., n. The quantities $F_t$ and $(x_t - Z^{T} s(t|t-1))$ are used in equations (2.7) and (2.8) to evaluate L(0) and L(c). For the model of equation (2.1), R = 0 and C = D where all the matrices are m x m.
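The state space form and the recursions (3.3) to (3.7) can be sketched together as follows. This is an illustrative implementation under stated assumptions rather than the program used in this work: the transition and input matrices are called `T` and `U` here, the block layout is the standard first order form described above, the function names are hypothetical, and no attempt is made to exploit the zero blocks.

```python
import numpy as np

def arma_state_space(A_list, B_list, m):
    """Build T (rm x rm), U (rm x m) and Z (rm x m) for a vector ARMA(p,q) model."""
    p, q = len(A_list), len(B_list)
    r = max(p, q + 1)
    k = r * m
    T = np.zeros((k, k))
    U = np.zeros((k, m))
    for i in range(r):
        if i < p:
            T[i * m:(i + 1) * m, :m] = A_list[i]             # first block column
        if i < r - 1:
            T[i * m:(i + 1) * m, (i + 1) * m:(i + 2) * m] = np.eye(m)
    U[:m] = np.eye(m)
    for j in range(q):
        U[(j + 1) * m:(j + 2) * m] = B_list[j]
    Z = np.zeros((k, m))
    Z[:m] = np.eye(m)                                        # Z^T = (I_m, 0, ..., 0)
    return T, U, Z

def kalman_filter(x, T, U, Z, C, R, s0, P0):
    """Return the innovations x_t - Z^T s(t|t-1) and the matrices F_t."""
    s, P = s0.copy(), P0.copy()
    innovations, F_all = [], []
    for x_t in x:
        s_pred = T @ s                          # (3.3)
        P_pred = T @ P @ T.T + U @ C @ U.T      # (3.4)
        F = Z.T @ P_pred @ Z + R                # (3.7)
        e = x_t - Z.T @ s_pred
        K = P_pred @ Z @ np.linalg.inv(F)
        s = s_pred + K @ e                      # (3.5)
        P = P_pred - K @ Z.T @ P_pred           # (3.6)
        innovations.append(e)
        F_all.append(F)
    return np.array(innovations), np.array(F_all)
```

The returned innovations and $F_t$ matrices are exactly the inputs needed to evaluate L(0) or L(c) for a trial parameter value.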


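In order to start the recursions in equation (3.3) to (3.6), initial values are needed for $s(0|0)$ and $P(0|0)$. The value of $s(0|0)$ is set equal to zero since this is the minimum mean square error estimate of $s_0$. From the stationarity of the first order autoregressive process $s_t$, the value of $P(0|0)$ is the solution $P_0$ of

$$P_0 = T P_0 T^{T} + U C U^{T} \tag{3.11a}$$

where

$$T =
\begin{bmatrix}
A_1 & I_m & 0 & \cdots & 0 \\
A_2 & 0 & I_m & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
A_{r-1} & 0 & 0 & \cdots & I_m \\
A_r & 0 & 0 & \cdots & 0
\end{bmatrix} \tag{3.11b}$$

$$U = \begin{bmatrix} I_m \\ B_1 \\ \vdots \\ B_{r-1} \end{bmatrix}. \tag{3.11c}$$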


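Harvey and Phillips (1979) point out that $P(0|0)$ can be found by solving the linear system

$$[I - T \otimes T]\,\mathrm{vec}(P_0) = \mathrm{vec}(U C U^{T}) \tag{3.12}$$

for $P_0$ (see Neudecker, 1969). The symbol $\otimes$ indicates the Kronecker product and the symbol vec(·) indicates forming a vector from the columns of the matrix. For example,

$$\mathrm{vec}(C) = (c_1^{T}, c_2^{T}, \ldots, c_m^{T})^{T}$$

where $c_i$ is the ith column of C. Using the definition of the Kronecker product and noting the form of T in (3.11b), $[I - T \otimes T]$ can be written as

$$\begin{bmatrix}
I - A_1 \otimes T & -I_m \otimes T & 0 & \cdots & 0 \\
-A_2 \otimes T & I & -I_m \otimes T & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
-A_{r-1} \otimes T & 0 & 0 & \cdots & -I_m \otimes T \\
-A_r \otimes T & 0 & 0 & \cdots & I
\end{bmatrix} \tag{3.13}$$

where I is the $rm^2 \times rm^2$ identity matrix. It is easy to see that (3.13) can be transformed into the lower block diagonal form of (3.14).

Performing the same operations to (3.15) that were used to transform (3.13) to (3.14) yields

$$\begin{bmatrix} \vdots \\ b_{r-1} + (I_m \otimes T)\, b_r \\ b_r \end{bmatrix}.$$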

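Now $P_0$ is an rm x rm matrix which we write as

$$P_0 = [P_{ij}], \quad i, j = 1, 2, \ldots, r,$$

where $P_{ij}$ is an m x m matrix and $P_{ij} = P_{ji}^{T}$. Let

$$P_i = \mathrm{vec}\begin{bmatrix} P_{1i} \\ \vdots \\ P_{ri} \end{bmatrix};$$

then using (3.14) and (3.17) we can recursively solve for the vector $P_i$, i = 1, 2, ..., r. That is,

$$P_1 = \Bigl[ I - \sum_{i=1}^{r} (A_i \otimes T^{i}) \Bigr]^{-1} \Bigl[ \sum_{i=1}^{r} (I_m \otimes T^{i-1})\, b_i \Bigr]$$

$$P_2 = \sum_{i=2}^{r} \bigl[ (A_i \otimes T^{i-1})\, P_1 + (I_m \otimes T^{i-2})\, b_i \bigr]$$

$$\vdots \tag{3.18}$$

$$P_k = \sum_{i=k}^{r} \bigl[ (A_i \otimes T^{i-k+1})\, P_1 + (I_m \otimes T^{i-k})\, b_i \bigr]$$

$$\vdots$$

$$P_r = (A_r \otimes T)\, P_1 + b_r.$$

From the above recursion, it can be seen that only a $rm^2 \times rm^2$ matrix inversion is required rather than a $(rm)^2 \times (rm)^2$ matrix inversion.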
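The saving can be checked numerically. The sketch below solves for the stationary covariance twice: once with the full Kronecker system and once with a block recursion of the kind just described, in which a single system of size $rm^2$ is solved for the first block and the remaining blocks follow by substitution. It assumes that $b_i$ denotes the ith block of $rm^2$ entries of $\mathrm{vec}(UCU^{T})$ and that $P_i$ is the matching block of $\mathrm{vec}(P_0)$; the function names are hypothetical. The coefficient matrices are those of the example in Section 4.4.

```python
import numpy as np

def direct_solve(T, U, C):
    """Solve [I - T (x) T] vec(P0) = vec(U C U^T) as one (rm)^2 x (rm)^2 system."""
    k = T.shape[0]
    rhs = (U @ C @ U.T).flatten(order="F")
    vec_p = np.linalg.solve(np.eye(k * k) - np.kron(T, T), rhs)
    return vec_p.reshape((k, k), order="F")

def block_recursion(A_blocks, T, U, C):
    """Solve the same system with one rm^2 x rm^2 solve plus substitution."""
    r = len(A_blocks)
    m = A_blocks[0].shape[0]
    k = r * m
    n = k * m                                            # size of each block P_i
    b = (U @ C @ U.T).flatten(order="F").reshape(r, n)   # b[i] holds b_{i+1}
    T_pow = [np.linalg.matrix_power(T, j) for j in range(r + 1)]
    I_m = np.eye(m)
    lhs = np.eye(n) - sum(np.kron(A_blocks[i], T_pow[i + 1]) for i in range(r))
    rhs = sum(np.kron(I_m, T_pow[i]) @ b[i] for i in range(r))
    P = np.zeros((r, n))
    P[0] = np.linalg.solve(lhs, rhs)                     # the only linear solve
    for j in range(1, r):                                # P[j] holds P_{j+1}
        P[j] = sum(np.kron(A_blocks[i], T_pow[i - j + 1]) @ P[0]
                   + np.kron(I_m, T_pow[i - j]) @ b[i] for i in range(j, r))
    return P.reshape(-1).reshape((k, k), order="F")

# Bivariate ARMA(1,1): r = max(1, 1 + 1) = 2, so A_2 = 0.
m = 2
A1 = np.array([[0.8, 0.6], [-0.5, 0.75]])
B1 = np.array([[0.65, -0.4], [0.4, 0.75]])
A_blocks = [A1, np.zeros((m, m))]
T = np.block([[A1, np.eye(m)], [np.zeros((m, m)), np.zeros((m, m))]])
U = np.vstack([np.eye(m), B1])
C = np.eye(m)
P_direct = direct_solve(T, U, C)
P_block = block_recursion(A_blocks, T, U, C)
```

In this small case the direct system is 16 x 16 and the block system is 8 x 8; the difference grows quickly with r and m.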


With A, B, D, $s(0|0)$ and $P(0|0)$ available, L(0), (2.7), and L(c), (2.8), can be evaluated using equations (3.3) to (3.7) and can be maximized by searching over A, B, and D. Since L(c), c > 0, is nonlinear for ARMA(p,q) models, L(c) is maximized using the Ellipsoid algorithm of Ecker and Kupferschmid (1984) for nonlinear optimization. Without a priori information, the initial values for A(c) and B(c) are set equal to zero and D(c) is set equal to one-half the sample covariance of the data. An ellipsoid about these initial values must be given such that the ellipsoid contains the optimal parameters. In general, the algorithm will terminate at the maximum of L(c) inside the ellipsoid. If it is suspected that L(c) evaluated at A(c), B(c) and D(c) is not the global maximum, the algorithm can be applied again using the current values of A(c), B(c) and D(c) as the center of a new ellipsoid; the size of the ellipsoid depends on how close the experimenter believes the current parameter values are to the optimal parameter values. In our experience, the maximum likelihood estimates A(0), B(0) and D(0) obtained from L(0) are used as the initial values of A(c), B(c) and D(c) for the maximization of L(c). This usually results in fewer iterations being required to obtain the final estimates of A(c), B(c) and D(c).

The innovations covariance matrix D is not estimated directly; since D is positive definite, D can be expressed as $D = LL^{T}$ where L is lower triangular. The matrix L is estimated by the algorithm and $\hat{D} = \hat{L}\hat{L}^{T}$. This procedure eliminates the need to add a constraint to ensure that D be positive definite. However, constraints may be needed to ensure that $|F_t| > 0$ for t = 1, 2, ..., $t_{pq}$ for $t_{pq}$ depending on p and q. Since the Ellipsoid algorithm is a constrained optimization procedure, the above constraints are easily incorporated by defining the constraints $g_t(A, B, D) = -|F_t| < 0$ for t = 1, 2, ..., $t_{pq}$. These constraints will aid in obtaining parameter estimates A(c) and B(c) corresponding to a stationary process.

For  a  multivariate  ARMA  process,  model  identification  is  a  problem 
since  the  representation  (2.1)  is  not  necessarily  unique.  A  discussion 
of  identification  and  canonical  forms  for  multivariate  processes  can  be 
found  in  Hannan  (1969),  Mayne  (1972),  Denham  (1974),  and  Dunsmuir  and 
Hannan (1976). Identifiability constraints can be added to the Ellipsoid algorithm to assure a unique representation (2.1).

Examination of (2.7), (2.8) and (2.11) reveals that evaluations of (2.11) and $\log |F_t|$ are the only differences in calculating L(c) and L(0); hence, the same basic computer program can be used to calculate L(0) or L(c).

The Kalman filter formulation allows for observation errors. That is, $x_t$ is equal to the true ARMA process plus the observation errors $v_t$. In this situation, the parameter R, the observation error covariance matrix, must also be estimated. Since R is positive definite, $R = R_a R_a^{T}$ where $R_a$ is lower triangular. As in the estimation of D, $R_a$ is estimated by the algorithm and $\hat{R} = \hat{R}_a \hat{R}_a^{T}$. As in the univariate case, the use of the observation noise in the model can result in a more parsimonious model.


The Kalman filter provides a natural way to accommodate missing data. If an observation $y_t$ is missing, the state vector $s(t|t)$ is set to $s(t|t-1)$, the predicted state vector for time t. Similarly, the error covariance matrix $P(t|t)$ is set to the predicted value $P(t|t-1)$. Since $y_t$ is missing, the functions L(0) and L(c) are not updated at time t. Clearly, the procedure can be applied to any number of missing observations; however, if a large number of observations are missing, the algorithm will essentially reinitialize at the next available observation.
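A single update step with this missing data rule might look as follows. The sketch assumes that a missing observation is coded as a vector containing NaN and that the whole vector is then skipped; both are conventions chosen for illustration, and `filter_step` is a hypothetical name.

```python
import numpy as np

def filter_step(x_t, s_pred, P_pred, Z, R):
    """Update step of the filter; skips the update when x_t is missing.

    Returns the updated state, its covariance, and the innovation and F_t
    (both None when the observation is missing, so that the likelihood
    functions receive no contribution at this time point).
    """
    if np.any(np.isnan(x_t)):
        return s_pred, P_pred, None, None
    F = Z.T @ P_pred @ Z + R
    e = x_t - Z.T @ s_pred
    K = P_pred @ Z @ np.linalg.inv(F)
    return s_pred + K @ e, P_pred - K @ Z.T @ P_pred, e, F
```

When several consecutive observations are skipped, the repeated prediction steps carry the covariance toward its stationary value, which is the reinitialization effect noted above.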


The estimation of A and B requires specifying p and q. In general, p and q are not known; they must be obtained via a selection procedure such as the PSIC selection criterion (2.4.17). For multivariate processes, the PSIC criterion becomes

PSIC(p,q) = log |D(c)| + s(c) log log n  (3.19)


where  s(c)  >  [(1  +  c ) 2/( 1  +  2c ) ] (m/2  +  For  c  =  0,  the  PSIC 

criteria  reduces  to  the  criterion  of  Hannan  (1981).  With  additive 
outliers  in  the  data,  the  PSIC  selection  criterion  will  select  the 
model  which  describes  the  bulk  of  the  data  since  data  inconsistent 
with  the  model  will  be  downweighted  and  thereby  will  have  a  reduced 
contribution  to  the  model  selected. 


The computational burden of the multivariate Kalman filter is considerably greater than for the univariate Kalman filter. We have taken advantage of the structure of (3.9) to eliminate matrix multiplies involving zero matrices. The m x m matrix $F_t$ must be inverted for each t. If the process is invertible, the system will tend towards steady state as t increases and $F_t$ will tend toward the matrix D. Consequently, for sufficiently large $t = \tilde{t}$, we can replace $F_t^{-1}$ by $D^{-1}$ and $|F_t|$ by $|D|$ for $t > \tilde{t}$.

For the ARMA process without observation noise (R = 0), the matrix $P(t|t)$ has the form

$$P(t|t) = \begin{bmatrix} 0 & 0 \\ 0 & \tilde{P}(t|t) \end{bmatrix}$$

for t = 1, 2, ..., n, where the leading zero block is m x m and $\tilde{P}(t|t)$ is a m(r-1) x m(r-1) matrix. Taking advantage of the special form of matrix T,

$$T P(t|t) T^{T} = \begin{bmatrix} \tilde{P}(t|t) & 0 \\ 0 & 0 \end{bmatrix}$$

which eliminates the need to perform any matrix multiplications in obtaining $P(t|t-1)$ from $P(t-1|t-1)$ via (3.4).
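The shortcut can be verified numerically. The sketch below compares the full product in (3.4) with a version that only shifts the nonzero block into the upper left corner; it assumes R = 0, the first order form of T with identity blocks above the diagonal, and an arbitrary illustrative value for the nonzero block. The function name is hypothetical.

```python
import numpy as np

def predicted_cov_shortcut(P_tilde, U, C, m):
    """P(t|t-1) when P(t-1|t-1) = blockdiag(0_m, P_tilde): shift, then add U C U^T."""
    k = P_tilde.shape[0] + m
    out = U @ C @ U.T
    out[:k - m, :k - m] += P_tilde     # T P T^T has P_tilde in its upper left corner
    return out

m = 2
A1 = np.array([[0.8, 0.6], [-0.5, 0.75]])
B1 = np.array([[0.65, -0.4], [0.4, 0.75]])
T = np.block([[A1, np.eye(m)], [np.zeros((m, m)), np.zeros((m, m))]])
U = np.vstack([np.eye(m), B1])
C = np.eye(m)
P_tilde = np.array([[2.0, 0.3], [0.3, 1.0]])   # illustrative values only
P_filtered = np.zeros((2 * m, 2 * m))
P_filtered[m:, m:] = P_tilde
full = T @ P_filtered @ T.T + U @ C @ U.T      # equation (3.4) as written
short = predicted_cov_shortcut(P_tilde, U, C, m)
```

The first block column of T never enters the product because it multiplies the zero block of $P(t|t)$, which is why the autoregressive matrices drop out of this step.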


4.4  An  Illustrative  Example 

We present an example to illustrate critical estimation for multivariate ARMA models. Consider the two-dimensional ARMA(1,1) process with representation

$$x_t = A x_{t-1} + \epsilon_t + B \epsilon_{t-1} \tag{4.1}$$

where

$$A = \begin{bmatrix} 0.8 & 0.6 \\ -0.5 & 0.75 \end{bmatrix}, \qquad B = \begin{bmatrix} 0.65 & -0.4 \\ 0.4 & 0.75 \end{bmatrix}$$

and $\epsilon_t$ has a multivariate Gaussian distribution with zero mean vector and covariance matrix equal to the identity matrix. Further, $E[\epsilon_t \epsilon_s^{T}] = 0$ for t ≠ s. A realization of the above process containing 120 samples was simulated and is shown in Figures 4.1a and 4.1b. The maximum likelihood and model-critical parameter estimates are presented in Table 4.1 for c = 0.025, 0.1, 0.2, and 0.3. Since the data and the model are internally consistent, the parameter estimates change little as c increases. Figure 4.2 is a plot of the critical weights

$$w_t = \exp\bigl[-c\,(x_t - x(t|t-1))^{T} F_t^{-1} (x_t - x(t|t-1))/2\bigr]. \tag{4.2}$$

Since by definition the model-critical weights are a measure of fit between the data and model, examination of the weights is an integral part of the modeling process. The weights shown in Figure 4.2 do not indicate any deficiencies in the model.
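The weights in (4.2) are inexpensive to compute once the filter has been run. A minimal sketch, assuming the innovations are stored in an n by m array and the $F_t$ matrices in an n by m by m array (an illustrative layout, with a hypothetical function name):

```python
import numpy as np

def critical_weights(innov, F, c):
    """Critical weights of (4.2): exp(-c * quadratic form / 2) for every t."""
    innov = np.asarray(innov, dtype=float)
    quad = np.einsum("ti,tij,tj->t", innov, np.linalg.inv(np.asarray(F)), innov)
    return np.exp(-0.5 * c * quad)

# Two illustrative innovations: one at the origin, one two units away.
w = critical_weights([[0.0, 0.0], [2.0, 0.0]], np.stack([np.eye(2), np.eye(2)]), 0.3)
```

An observation that agrees with its prediction gets weight one, and the weight falls toward zero as the standardized prediction error grows, so a plot of these values against t highlights the observations the fitted model explains poorly.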




TABLE 4.1

Maximum Likelihood (c = 0) and Model-Critical (c ≠ 0)
Parameter Estimates for a Simulated Bivariate
ARMA(1,1) Process with Innovations
Distributed Normal N_2(0,I)


To examine critical estimation of an ARMA process with outliers, four additive outliers were added at random to the realization discussed above; plots of the two series are shown in Figures 4.3a and 4.3b. The outliers were distributed multivariate Gaussian with zero mean vector and covariance matrix 2I where I is the 2 x 2 identity matrix. Each outlier is independent of $x_t$ and the other outliers. Table 4.2 presents the maximum likelihood (c = 0) and model-critical parameter estimates for the ARMA(1,1) process with additive outliers. As in the univariate example, the outliers are not obvious from the plots, as can be seen by comparing Figures 4.3a and 4.3b with Figures 4.1a and 4.1b, respectively.

For this example, the moving average and covariance matrix parameters are the only parameters which change considerably as c increases. For c = 0.3, it can be seen that the model-critical parameter estimates are approximately the same as those obtained from the uncontaminated realization. The change in the moving average and covariance matrix estimates as c increases from 0 to 0.4 results from the downweighting of the outliers. For c = 0, all the data are weighted equally; this results in the variation caused by the outliers being summarized in the covariance matrix. The outliers break up the moving average part of the process.

FIGURE 4.3b A Simulated Bivariate ARMA(1,1) Process with Innovations Distributed Normal N_2(0,I), and Four Additive Outliers

TABLE 4.2

Maximum Likelihood (c = 0) and Model-Critical (c ≠ 0)
Parameter Estimates for a Simulated Bivariate
ARMA(1,1) Process with Innovations
Distributed Normal N_2(0,I),
and 4 Additive Outliers

From another perspective, as c increases the estimation procedure becomes increasingly more critical of the data and model. This criticism can be seen by examining the critical weights which are shown in Figure 4.4. The small weights about observation 88 indicate that the model which describes the bulk of the data does not give a good fit to these observations. Since the observations in an ARMA process are not independent, a single additive outlier can result in the downweighting of neighboring observations. This is seen in the small weights about observation 88; observations 87 and 88 are both contaminated by additive outliers. The outliers at these observations result in the downweighting of samples 88, 89, 90, 91 and 92. This downweighting is necessary to reduce the effect of the outlier on $x(t|t-1)$.

To examine the effects of innovative outliers, a realization of
(4.1) was examined where the entries of e_t were independent and
identically t-distributed random variables with 5 degrees of freedom.
Plots of the realization are shown in Figures 4.5a and 4.5b. Table
4.3 presents the maximum likelihood and model-critical estimates for
c = 0.025, 0.1, 0.2 and 0.3. The location parameters A(c) and B(c)
are approximately the same for c = 0, 0.025, 0.1, 0.2 and 0.3.
However, the covariance matrix D(c) changes considerably as c
increases.

TABLE 4.3

Maximum Likelihood (c = 0) and Model-Critical (c ≠ 0)
Parameter Estimates for a Simulated Bivariate
ARMA(1,1) Process with Innovations
Distributed t(5)

  c        0        0.025    0.1      0.2      0.3

  a11     0.847    0.849    0.853    0.855    0.856
  a21    -0.485   -0.492   -0.506   -0.512   (four of five entries survive; column alignment uncertain)
  a12     0.577    0.586    0.571    0.567    0.556
  a22     0.614    0.607    0.606    0.606    0.606
  b11     0.532    0.537    0.545    0.554    0.561
  b21     0.376    0.374    0.369    0.375   (four of five entries survive; column alignment uncertain)
  b12    -0.409   -0.409   -0.420   -0.422   -0.425
  b22     0.700    0.710    0.730    0.741    0.739
  d11     1.376    1.356    1.294    1.216    1.151
  d21     0.104    0.093    0.064    0.042    0.033
  d22     1.353    1.296    1.144    1.011    0.928


These examples provide insight into the analysis of ARMA models.
If the location and scale parameters are approximately constant over a
range of c, then the parametric and distributional models are
reasonable. If the location parameters remain approximately constant
over a range of c values but the scale parameters do not, then the
Gaussian error model is suspect. Heavy tailed distributions yield
variances that decrease as c increases, whereas the opposite occurs
for short tailed distributions. Finally, if both scale and location
parameters change considerably over a range of c values, then the
presence of additive outliers is suspected.
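
The three interpretation rules above can be written as a small decision
routine. The following is only an illustrative sketch: the function
names, the use of the relative change from the c = 0 estimate, and the
10 percent tolerance are assumptions made here and are not part of the
procedure described in the text.

```python
import numpy as np


def relative_spread(path):
    """Largest relative deviation of a parameter path from its c = 0 value.

    `path` holds the estimates of one parameter for increasing c,
    with the c = 0 (maximum likelihood) estimate first.
    """
    path = np.asarray(path, dtype=float)
    return float(np.max(np.abs(path - path[0])) / np.abs(path[0]))


def diagnose(location_paths, variance_paths, tol=0.10):
    """Apply the three rules; `tol` is a hypothetical stability threshold."""
    loc_moves = any(relative_spread(p) > tol for p in location_paths)
    var_moves = any(relative_spread(p) > tol for p in variance_paths)

    if not loc_moves and not var_moves:
        return "location and scale stable: model reasonable"
    if not loc_moves and var_moves:
        if all(p[-1] < p[0] for p in variance_paths):
            return "scale drifts down: heavy-tailed errors suspected"
        if all(p[-1] > p[0] for p in variance_paths):
            return "scale drifts up: short-tailed errors suspected"
        return "scale unstable: Gaussian error model suspect"
    if loc_moves and var_moves:
        return "location and scale unstable: additive outliers suspected"
    return "location unstable only: no rule in the text applies"
```

Applied to the a11, b11 paths and the d11, d22 paths of Table 4.3, the
location estimates move by a few percent while both variances fall by
more than the assumed tolerance, so the sketch reports heavy-tailed
errors, in line with the t(5) innovations used for that realization.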

4.5  Summary 

Model-critical  procedures  have  been  presented  for  the  analysis  of 
multivariate  ARMA  models.  These  procedures  provide  a  means  to  assess 
whether  the  observed  data  and  the  assumed  model  are  internally 
consistent.  A  Kalman  filter  algorithm  is  used  to  obtain  the  model- 
critical  parameter  estimates.  Since  the  samples  are  processed 
individually,  the  algorithm  allows  for  data  inconsistent  with  the 
model  to  be  downweighted  during  the  estimation  process;  the 
inconsistent  data  are  identified  by  the  critical  weights  which  can  aid 
the  modeling  process.  The  PSIC  selection  criterion  can  be  used  to 
select an ARMA model from a set of candidate models. In Part 5, a
test for multivariate normality which compares D(0) and D(c) will be
presented. It will be shown that the test can be applied to residuals
after fitting with a linear model.

PART 5

A  TEST  FOR  MULTIVARIATE  NORMALITY 
BASED  ON  THE  GENERALIZED  LIKELIHOOD 
AND  DIVERGENCE 


5.1  Introduction 

The Shapiro-Wilk test (Shapiro and Wilk, 1965) for univariate
normality W is based on a comparison of two different estimates of the
variance; if the random sample x_1, x_2, ..., x_n is Gaussian, the
two estimators should be close and their ratio should be unity, apart
from sampling error. The distribution of

    W = \frac{\left( \sum_{i=1}^{n} a_i x_{(i)} \right)^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}        (1.1)

is analytically intractable and must be developed by simulation. Here
the a_i are tabulated constants (Shapiro, 1980), x_{(i)} is the ith
order statistic, and \bar{x} = (1/n) \sum_i x_i. This
idea may be extended to the multivariate setting by analogy, i.e., by
determining two estimators of the covariance matrix and by determining
a sensible way of comparing these estimators. We shall also require
that the measure of closeness which constitutes the test statistic be
such that it is applicable to the case of a single sample without
structure as well as the structured case. For example, given the
concomitant variables z_{i1}, z_{i2}, ..., z_{iq}, the x_i have a combined
systematic and error structure given by

    x_i = h(z_{i1}, z_{i2}, \ldots, z_{iq}; \theta_1, \theta_2, \ldots, \theta_q) + \epsilon_i        (1.2)

where h is (tentatively) functionally specified, apart from the
parameters \theta_1, \theta_2, ..., \theta_q which are to be estimated
from the data, and the \epsilon_i are independent, identically
distributed p-variate Gaussian variates with mean vector 0 and
positive definite covariance matrix D.
For the case of a single sample without structure, h of (1.2) reduces
to h = m, for example, so that the x_i are p-variate Gaussian with
mean m and covariance matrix D, denoted N_p(m,D), with density

    f(x) = |2\pi D|^{-1/2} \exp\{ -(x - m)^T D^{-1} (x - m)/2 \}.        (1.3)


The  model  representation  allows  for  regression  models,  experimental 
design  models,  general  linear  models,  and  nonlinear  models.  Our  test 
for  p-variate  normality  will  be  constructed  from  consideration  of  the 
divergence  (Kullback,  1959,  Chapters  1,  2,  and  8)  which  is  discussed 
in  the  next  section. 


5.2 Information Statistics

From an information theory perspective, m(0), D(0) and m(c), D(c)
of (2.2.7) can be thought of as the mean and covariance matrix for two
normal distributions with probability densities f_0 and f_c,
respectively. The divergence J(0,c) per observation between the two
densities f_0 and f_c is defined by

    J(0,c) = \int_{R^p} (f_0(x) - f_c(x)) \log(f_0(x)/f_c(x)) \, dx        (2.1)

where R^p denotes the p-dimensional Euclidean space (Kullback, 1959,
Chapter 1). The quantity J(0,c) \ge 0 and is equal to zero if and only
if f_0 = f_c (Kullback, 1959, Chapter 2). For n independent
observations, the divergence is

    J(0,c:n) = \sum_{i=1}^{n} J_i(0,c)        (2.2)

where J_i(0,c) is the divergence in the ith observation. If the
observations are identically distributed, then J(0,c:n) = nJ(0,c)
(Kullback, 1959, Chapter 2).
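
The non-negativity of the divergence quoted above can be seen directly
from its definition. The short sketch below splits the integrand into
two directed information numbers; the symbols I(0:c) and I(c:0) are
notation introduced here for illustration only.

```latex
\begin{align*}
J(0,c) &= \int_{R^p} f_0(x)\,\log\frac{f_0(x)}{f_c(x)}\,dx
        + \int_{R^p} f_c(x)\,\log\frac{f_c(x)}{f_0(x)}\,dx
        = I(0\!:\!c) + I(c\!:\!0), \\
(a - b)\,\log\frac{a}{b} &\ge 0 \qquad \text{for all } a, b > 0 .
\end{align*}
```

The second line holds because a - b and log(a/b) always have the same
sign, so the integrand of the divergence is non-negative at every x and
vanishes only where the two densities agree.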

By definition, J(0,c) is a measure of the difference between two
distributions with densities f_0 and f_c. For the Gaussian
probability densities f_0 and f_c, it is straightforward to
evaluate (2.1), which yields

    J(0,c) = \frac{1}{2} tr[ D(0)D(c)^{-1} + D(c)D(0)^{-1} ] - p
             + \frac{1}{2} (m(0) - m(c))^T [ D(0)^{-1} + D(c)^{-1} ] (m(0) - m(c))        (2.3)

where "tr" denotes the trace of a matrix. Let

    J_1 = \frac{1}{2} tr[ D(0)D(c)^{-1} + D(c)D(0)^{-1} ] - p        (2.4)

    J_2 = \frac{1}{2} (m(0) - m(c))^T [ D(0)^{-1} + D(c)^{-1} ] (m(0) - m(c))        (2.5)

then J(0,c) can be written as J(0,c) = J_1 + J_2. The term J_1 is a
measure of the differences between the covariance matrices D(0) and
D(c), whereas the term J_2 provides a measure of the differences
between the means m(0) and m(c) with respect to D(0) and D(c). That
is, J(0,c) measures the difference between two normal distributions by
comparing their means and covariances as seen in (2.3) through (2.5).
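
The closed form for the Gaussian divergence can be checked numerically
against the defining integral. The sketch below is an illustration
only: the function names are hypothetical, NumPy is assumed to be
available, and the quadrature check is restricted to the univariate
case (p = 1) with a simple trapezoidal rule.

```python
import numpy as np


def gaussian_divergence(m0, D0, mc, Dc):
    """Return (J1, J2) for the pair N(m0, D0) and N(mc, Dc)."""
    m0 = np.atleast_1d(np.asarray(m0, dtype=float))
    mc = np.atleast_1d(np.asarray(mc, dtype=float))
    D0 = np.atleast_2d(np.asarray(D0, dtype=float))
    Dc = np.atleast_2d(np.asarray(Dc, dtype=float))
    p = m0.size
    D0_inv = np.linalg.inv(D0)
    Dc_inv = np.linalg.inv(Dc)
    # Covariance term: (1/2) tr[D0 Dc^-1 + Dc D0^-1] - p
    J1 = 0.5 * np.trace(D0 @ Dc_inv + Dc @ D0_inv) - p
    # Mean term: (1/2) d^T [D0^-1 + Dc^-1] d with d = m0 - mc
    d = m0 - mc
    J2 = 0.5 * d @ (D0_inv + Dc_inv) @ d
    return float(J1), float(J2)


def divergence_by_quadrature(m0, s0, mc, sc, half_width=12.0, num=200001):
    """Univariate check: integrate (f0 - fc) log(f0 / fc) on a fine grid."""
    lo = min(m0, mc) - half_width * max(s0, sc)
    hi = max(m0, mc) + half_width * max(s0, sc)
    x = np.linspace(lo, hi, num)
    log_f0 = -0.5 * np.log(2 * np.pi * s0**2) - 0.5 * ((x - m0) / s0) ** 2
    log_fc = -0.5 * np.log(2 * np.pi * sc**2) - 0.5 * ((x - mc) / sc) ** 2
    integrand = (np.exp(log_f0) - np.exp(log_fc)) * (log_f0 - log_fc)
    dx = x[1] - x[0]
    # Trapezoidal rule written out explicitly.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1])) * dx)
```

For example, with m(0) = 0, D(0) = 1, m(c) = 0.5 and D(c) = 2.25 the
two terms returned by `gaussian_divergence` should add up to the value
returned by `divergence_by_quadrature(0.0, 1.0, 0.5, 1.5)` to within
the accuracy of the grid.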

If the data x_1, ..., x_n are multivariate normal, then m(c) ≈ m(0)
and D(c) ≈ D(0); hence, J(0,c) will be small. If the data are not
Gaussian, D(c) will differ considerably from D(0); however, m(c) and
m(0) may or may not differ depending on the nature of the non-normality
as seen by the example in Section 2.2. The expression for J(0,c) in
(2.3) is for unstructured Gaussian data. For data with additional
structure as in (1.2), J(0,c) can still be written as J_1 + J_2,
where J_1 is defined by (2.4) and J_2 is defined by (2.5) with m(0)
and m(c) replaced by h_i(\hat{\theta}(0)) and h_i(\hat{\theta}(c)),
respectively. The quantity h_i(\theta) = h(z_{i1}, z_{i2}, ..., z_{iq};
\theta), where \theta = (\theta_1, \theta_2, ..., \theta_q)^T.
Thus, the term J_1 is the same for data with or without additional
structure, but J_2 is not.

From the above discussion, a family of tests for multivariate
normality indexed by c is

    T_1(c) = n J_1 = \frac{n}{2} \{ tr[ D(0)D(c)^{-1} + D(c)D(0)^{-1} ] - 2p \}.        (2.6)

For each value of c, T_1(c) produces a test statistic similar to the
Shapiro-Wilk test statistic. Since D(c) and D(0) are estimates of the
covariance matrix D for structured and unstructured models, the
definition of T_1(c) indicates that the effects of estimating
structural parameters on D(c) and D(0) should approximately cancel.
The same does not apply to J_2 since it depends on the concomitant
variables z_{i1}, z_{i2}, ..., z_{iq} and the model parameters
\theta_1, \theta_2, ..., \theta_q.
For example, for the univariate linear regression model
h_i(\theta) = \theta_0 + \theta_1 z_i = \theta^T v_i, where
\theta = (\theta_0, \theta_1)^T and v_i = (1, z_i)^T,

    J_{2i} = \frac{1}{2} v_i^T (\hat{\theta}(0) - \hat{\theta}(c)) (s^{-2}(0) + s^{-2}(c)) (\hat{\theta}(0) - \hat{\theta}(c))^T v_i

for a single observation. For the entire sample,

    J_2 = \frac{1}{2} \sum_{i=1}^{n} v_i^T (\hat{\theta}(0) - \hat{\theta}(c)) (s^{-2}(0) + s^{-2}(c)) (\hat{\theta}(0) - \hat{\theta}(c))^T v_i

which shows that J_2 depends on the predictor variables v_i. In
fact, for structured data, J_2 is a measure of the differences
between the predicted values h_i(\hat{\theta}(0)) and
h_i(\hat{\theta}(c)). In Section 4, it will be shown that T_1(c) is
insensitive to the underlying model; thus, T_1(c) provides a test of
multivariate normality for the residuals from an assumed model such
as a linear model.
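
The contrast between the two terms can be made concrete with a short
sketch. The function names and the small numerical inputs are
hypothetical, and NumPy is assumed; `t1_statistic` follows (2.6) and
`j2_regression` follows the whole-sample expression for the linear
regression example, including the factor 1/2 carried over from (2.5).

```python
import numpy as np


def t1_statistic(D0, Dc, n):
    """T1(c) = (n/2) * (tr[D0 Dc^-1 + Dc D0^-1] - 2p), as in (2.6)."""
    D0 = np.atleast_2d(np.asarray(D0, dtype=float))
    Dc = np.atleast_2d(np.asarray(Dc, dtype=float))
    p = D0.shape[0]
    tr = np.trace(D0 @ np.linalg.inv(Dc) + Dc @ np.linalg.inv(D0))
    return float(0.5 * n * (tr - 2 * p))


def j2_regression(V, theta0, thetac, s0_sq, sc_sq):
    """Whole-sample J2 for the regression model h_i(theta) = theta^T v_i.

    V has one row v_i = (1, z_i) per observation; s0_sq and sc_sq are
    the two residual variance estimates.
    """
    V = np.asarray(V, dtype=float)
    delta = np.asarray(theta0, dtype=float) - np.asarray(thetac, dtype=float)
    diff = V @ delta  # differences between the two sets of fitted values
    return float(0.5 * (1.0 / s0_sq + 1.0 / sc_sq) * np.sum(diff**2))
```

Rescaling the predictor z changes `j2_regression` even when the two
parameter estimates are held fixed, whereas `t1_statistic` uses only
the two covariance estimates and is unchanged when both are
transformed by the same nonsingular matrix A (D becomes A D A^T).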


Like the Shapiro-Wilk test statistic, the distribution of T_1(c)
must be obtained via Monte Carlo simulations because it is otherwise
intractable. Since T_1(c) ≥ 0 and large values indicate non-normality,
only the upper percentage points of the test statistic are required.
Tables 5.1 to 5.8 (a-e) contain percentage points of T_1(c) for
p = 1(1)6, 8, and 10; sample sizes n = 10, 20, 24, 30, 40, 60, and
120; and c values dependent on the dimension p. For p = 5, 6, 8, and
10, the smallest sample size used was n = 24. For each n and p, the
percentage points of T_1(c) via Monte Carlo simulation were based on
10,000 samples from N_p(m,D). Since the procedures are affine
invariant (Delaney, 1979), m = 0 and D = I were used in the analysis.
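
The simulation scheme just described can be sketched as follows. This
is a minimal illustration and not the program used for the tables: the
estimates of (2.2.7) are not reproduced in this part, so
`weighted_estimates` is an assumed stand-in (a weighted fixed-point
iteration with weights proportional to the fitted Gaussian density
raised to the power c), and all function names are hypothetical. Its
output should therefore not be expected to reproduce the tabulated
percentage points.

```python
import numpy as np


def ml_estimates(x):
    """Maximum likelihood mean and covariance (the c = 0 estimates)."""
    m = x.mean(axis=0)
    r = x - m
    return m, r.T @ r / x.shape[0]


def weighted_estimates(x, c, iters=50):
    """ASSUMED stand-in for m(c), D(c); not taken from the source."""
    m, D = ml_estimates(x)
    for _ in range(iters):
        r = x - m
        # Squared Mahalanobis distance of each observation.
        q = np.einsum("ij,jk,ik->i", r, np.linalg.inv(D), r)
        w = np.exp(-0.5 * c * q)
        w /= w.sum()
        m = w @ x
        r = x - m
        D = (1.0 + c) * (r * w[:, None]).T @ r
    return m, D


def t1(x, c):
    """T1(c) of (2.6) for one sample x with n rows and p columns."""
    n, p = x.shape
    _, D0 = ml_estimates(x)
    _, Dc = weighted_estimates(x, c)
    tr = np.trace(D0 @ np.linalg.inv(Dc) + Dc @ np.linalg.inv(D0))
    return float(0.5 * n * (tr - 2 * p))


def upper_percentage_points(n, p, c, reps=10_000,
                            probs=(75, 80, 85, 90, 95, 97.5, 99), seed=0):
    """Simulate T1(c) under N_p(0, I) and return its upper percentage points."""
    rng = np.random.default_rng(seed)
    stats = np.array([t1(rng.standard_normal((n, p)), c) for _ in range(reps)])
    return dict(zip(probs, np.percentile(stats, probs)))
```

A call such as `upper_percentage_points(n=20, p=2, c=0.1)` returns one
simulated percentage point per tabulated level; sampling from N_p(0, I)
alone is sufficient here only because of the affine invariance noted
above.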
TABLE 5.1a.  Upper Tail Percentage Points for the Statistic
T_1(c) x 100, p = 1

(The c label of each block is not legible in the source; "?" marks an
illegible digit.)

                                  PERCENT
  c      n      75       80       85       90       95       97.5     99

  ?      10    0.396    0.448    0.505    0.589    0.717    0.876    1.69
         20    0.505    0.588    0.693    0.838    1.19     2.17     4.43
         24    0.542    0.636    0.756    0.930    1.34     2.34     4.97
         30    0.565    0.674    0.807    1.01     1.47     2.55     5.23
         40    0.625    0.750    0.899    1.16     1.75     3.04     5.69
         60    0.684    0.822    1.00     1.28     1.91     3.20     6.38
        120    0.780    0.947    1.170    1.52     2.21     3.33     5.63

  ?      10    0.254    0.288    0.325    0.377    0.457    0.5??    0.99
         20    0.324    0.377    0.444    0.538    0.751    1.34     2.72
         24    0.349    0.409    0.485    0.596    0.844    1.44     3.05
         30    0.362    0.433    0.518    0.647    0.926    1.59     3.25
         40    0.402    0.482    0.578    0.739    1.11     1.90     3.62
         60    0.440    0.529    0.643    0.821    1.22     2.02     4.07
        120    0.505    0.613    0.755    0.979    1.42     2.16     3.61

  ?      10    0.143    0.163    0.184    0.213    0.256    0.302    0.517
         20    0.183    0.213    0.250    0.303    0.418    0.723    1.45
         24    0.197    0.231    0.273    0.335    0.471    0.780    1.64
         30    0.206    0.244    0.293    0.365    0.516    0.874    1.78
         40    0.228    0.272    0.326    0.416    0.615    1.06     1.96
         60    0.250    0.299    0.363    0.463    0.685    1.14     2.26
        120    0.287    0.347    0.429    0.555    0.805    1.23     2.05

  ?      10    0.0998   0.113    0.128    0.148    0.177    0.210    0.347
         20    0.127    0.148    0.174    0.211    0.289    0.493    0.983
         24    0.137    0.161    0.189    0.233    0.325    0.531    1.11
         30    0.143    0.170    0.203    0.253    0.360    0.596    1.22
         40    0.159    0.189    0.228    0.289    0.429    0.729    1.36
         60    0.174    0.209    0.253    0.323    0.476    0.788    1.56
        120    0.200    0.242    0.299    0.387    0.561    0.849    1.42
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TABLE  5.1b.  Upper  Tail  Percentage  Points  for  the  Statistic 

T-j  (c)  x  1000,  p  =  1 


PERCENT 


95  97.5 


0 

0.1 

0 

0.2 

4 

0.2 

0 

0.2 

0 

0.2 

0 

0.2 

0 

0.3 

34 

0. 

,160 

0. 

.185 

90 

0. 

,253 

0, 

.389 

10 

0. 

,284 

0, 

.423 

29 

0. 

,316 

0. 

.490 

61 

0. 

,379 

0, 

.617 

209 

0. 

,238 

0. 

,284 

0. 

.329 

282 

0. 

,338 

0. 

449 

0. 

.685 

307 

0. 

,374 

0. 

.504 

0. 

.746 

329 

0. 

,407 

0. 

.562 

0. 

.865 

371 

0. 

,463 

0. 

.673 

1  . 

.09 

412 

0. 

.523 

0. 

.759 

1  . 

.22 

496 

0. 

.632 

0. 

.899 

1  . 

.39 

TABLE 5.1d.  Upper Tail Percentage Points for the Statistic
T_1(c) x 100, p = 1

(Only the label c = -0.025 survives in the source; its block is not
identifiable, so the c column is left open.)

                                  PERCENT
  c      n      75       80       85       90       95       97.5     99

  ?      10    0.103    0.116    0.131    0.150    0.177    0.205    0.248
         20    0.130    0.150    0.177    0.213    0.279    0.407    0.782
         24    0.141    0.164    0.193    0.234    0.312    0.445    0.912
         30    0.148    0.175    0.207    0.255    0.348    0.522    1.04
         40    0.163    0.194    0.233    0.289    0.419    0.659    1.22
         60    0.180    0.214    0.260    0.328    0.472    0.754    1.49
        120    0.208    0.252    0.314    0.398    0.571    0.872    1.47

  ?      10    0.148    0.167    0.189    0.216    0.256    0.296    0.351
         20    0.188    0.217    0.255    0.307    0.402    0.576    1.10
         24    0.204    0.237    0.278    0.338    0.450    0.632    1.29
         30    0.213    0.253    0.299    0.367    0.502    0.738    1.48
         40    0.236    0.280    0.337    0.416    0.601    0.939    1.74
         60    0.260    0.308    0.375    0.475    0.679    1.08     2.13
        120    0.302    0.364    0.454    0.575    0.824    1.26     2.12

  ?      10    0.266    0.298    0.338    0.385    0.458    0.522    0.608
         20    0.336    0.388    0.455    0.548    0.711    0.987    1.86
         24    0.362    0.424    0.497    0.604    0.793    1.09     2.21
         30    0.381    0.451    0.534    0.656    0.890    1.27     2.53
         40    0.421    0.502    0.599    0.739    1.06     1.63     3.04
         60    0.466    0.552    0.670    0.843    1.21     1.89     3.75
        120    0.542    0.656    0.815    1.03     1.48     2.25     3.77

  ?      10    0.418    0.470    0.531    0.604    0.720    0.820    0.924
         20    0.527    0.608    0.715    0.859    1.10     1.50     2.79
         24    0.570    0.666    0.782    0.943    1.24     1.65     3.32
         30    0.598    0.709    0.837    1.03     1.38     1.95     3.83
         40    0.664    0.790    0.941    1.16     1.65     2.49     4.65
         60    0.733    0.868    1.05     1.32     1.89     2.92     5.73
        120    0.854    1.04     1.28     1.62     2.31     3.55     5.87


TABLE 5.1e.  Upper Tail Percentage Points for the Statistic
T_1(c), p = 1

                                  PERCENT
  c      n      75       80       85       90       95       97.5     99

  0.3    10    0.133    0.152    0.181    0.233    0.745    2.11     8.16
         20    0.173    0.205    0.248    0.338    0.770    1.46     3.19
         24    0.181    0.215    0.267    0.361    0.767    1.42     2.90
         30    0.190    0.231    0.284    0.379    0.767    1.41     2.46
         40    0.200    0.243    0.304    0.402    0.746    1.32     2.20
         60    0.213    0.259    0.320    0.413    0.647    1.11     1.83
        120    0.220    0.275    0.343    0.449    0.669    0.961    1.50

  0.2    10    0.0592   0.0677   0.0784   0.0954   0.177    0.476    1.33
         20    0.0782   0.0919   0.110    0.140    0.269    0.544    1.29
         24    0.0821   0.0975   0.119    0.153    0.286    0.559    1.24
         30    0.0872   0.104    0.127    0.164    0.311    0.577    1.13
         40    0.0924   0.112    0.138    0.182    0.317    0.583    1.03
         60    0.0999   0.121    0.149    0.192    0.295    0.523    0.928
        120    0.0106   0.131    0.164    0.212    0.322    0.471    0.748

  0.1    10    0.0153   0.0174   0.0198   0.0234   0.0307   0.0522   0.116
         20    0.0199   0.0234   0.0279   0.0342   0.0510   0.0951   0.232
         24    0.0210   0.0247   0.0297   0.0374   0.0556   0.113    0.253
         30    0.0227   0.0269   0.0321   0.0405   0.0660   0.126    0.263
         40    0.0242   0.0292   0.0352   0.0448   0.0723   0.134    0.248
         60    0.0265   0.0317   0.0390   0.0499   0.0741   0.132    0.236
        120    0.0285   0.0355   0.0455   0.0569   0.0848   0.127    0.241

 -0.1    10    0.0171   0.0193   0.0220   0.0249   0.0296   0.0336   0.0382
         20    0.0221   0.0254   0.0296   0.0351   0.0439   0.0553   0.0930
         24    0.0231   0.0269   0.0315   0.0387   0.0497   0.0635   0.112
         30    0.0253   0.0296   0.0348   0.0418   0.0542   0.0713   0.135
         40    0.0270   0.0319   0.0380   0.0469   0.0629   0.0908   0.168
         60    0.0303   0.0361   0.0434   0.0546   0.0747   0.110    0.199
        120    0.0341   0.0415   0.0516   0.0650   0.0946   0.136    0.244
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ft- 


.305 

0.332  C 

.440 

0.496  C 

.464 

0.527  C 

.512 

0.593  C 

.558 

0.655  C 

.640 

0.746  C 

.760 

0.869  1 

.  181 
.254 
0.268 
0.294 
0.317 
0.366 
0.429 


.196 
.282 
0.296 
0.327 
0.357 
0.411 
0.488 


0.212 

0.317 

0.336 

0.378 

0.419 

0.478 

0.558 


.235 
.381 
0.408 
0.468 
0.510 
0.591 
0.674 


0.0634 

0.0671 

0.0733 

0.0794 

0.0918 

0.108 


TABLE 5.2b.  Upper Tail Percentage Points for the Statistic
T_1(c) x 1000, p = 2

                                  PERCENT
  c      n      75       80       85       90       95       97.5     99

 0.008   10    0.290    0.314    0.339    0.374    0.436    0.518    0.696
         20    0.406    0.449    0.504    0.600    0.909    1.37     2.19
         24    0.429    0.474    0.538    0.642    0.935    1.47     2.32
         30    0.469    0.523    0.604    0.739    1.13     1.73     2.92
         40    0.509    0.572    0.665    0.808    1.23     1.89     2.94
         60    0.588    0.659    0.762    0.943    1.44     2.19     3.43
        120    0.622    0.787    0.896    1.08     1.58     2.33     3.61

 0.006   10    0.163    0.176    0.191    0.211    0.245    0.289    0.387
         20    0.228    0.252    0.283    0.336    0.506    0.765    1.21
         24    0.241    0.266    0.302    0.360    0.521    0.820    1.29
         30    0.264    0.294    0.340    0.414    0.632    0.964    1.62
         40    0.287    0.322    0.374    0.453    0.689    1.06     1.65
         60    0.331    0.371    0.429    0.529    0.807    1.23     1.92
        120    0.390    0.443    0.504    0.611    0.891    1.31     2.03

 0.004   10    0.0726   0.0782   0.0849   0.0936   0.109    0.127    0.170
         20    0.101    0.112    0.126    0.149    0.223    0.337    0.533
         24    0.107    0.118    0.134    0.160    0.230    0.361    0.568
         30    0.118    0.131    0.151    0.184    0.279    0.425    0.714
         40    0.127    0.143    0.166    0.201    0.305    0.468    0.727
         60    0.147    0.165    0.191    0.234    0.358    0.542    0.851
        120    0.173    0.197    0.225    0.272    0.396    0.582    0.900

 0.002   10    0.0181   0.0195   0.0212   0.0234   0.0272   0.0315   0.0418
         20    0.0253   0.0280   0.0313   0.0371   0.0552   0.0834   0.132
         24    0.0268   0.0296   0.0335   0.0399   0.0572   0.0895   0.141
         30    0.0294   0.0327   0.0376   0.0458   0.0692   0.106    0.177
         40    0.0318   0.0358   0.0416   0.0501   0.0758   0.116    0.181
         60    0.0368   0.0412   0.0477   0.0585   0.0894   0.135    0.212
        120    0.0434   0.0493   0.0561   0.0682   0.0991   0.146    0.225


TABLE 5.2c.  Upper Tail Percentage Points for the Statistic
T_1(c) x 1000, p = 2

(The c label of the first block is not legible in the source.)

                                  PERCENT
  c      n      75       80       85       90       95       97.5     99

  ?      10    0.0182   0.0196   0.0212   0.0233   0.0271   0.0310   0.0406
         20    0.0253   0.0279   0.0313   0.0369   0.0543   0.0817   0.128
         24    0.0268   0.0296   0.0335   0.0398   0.0562   0.0882   0.138
         30    0.0294   0.0327   0.0376   0.0456   0.0685   0.104    0.173
         40    0.0319   0.0359   0.0416   0.0501   0.0754   0.115    0.179
         60    0.0369   0.0411   0.0476   0.0585   0.0892   0.135    0.210
        120    0.0435   0.0495   0.0562   0.0685   0.0992   0.146    0.227

-0.004   10    0.0726   0.0784   0.0849   0.0932   0.108    0.124    0.160
         20    0.101    0.112    0.125    0.147    0.216    0.324    0.508
         24    0.107    0.118    0.134    0.159    0.224    0.349    0.549
         30    0.118    0.131    0.150    0.182    0.273    0.416    0.686
         40    0.128    0.143    0.166    0.200    0.300    0.459    0.711
         60    0.148    0.165    0.190    0.234    0.356    0.536    0.839
        120    0.174    0.198    0.225    0.274    0.397    0.586    0.911

-0.006   10    0.164    0.176    0.191    0.210    0.243    0.277    0.355
         20    0.228    0.251    0.281    0.331    0.483    0.720    1.13
         24    0.241    0.266    0.301    0.356    0.501    0.779    1.23
         30    0.265    0.294    0.338    0.408    0.610    0.930    1.53
         40    0.288    0.322    0.374    0.450    0.674    1.02     1.59
         60    0.332    0.371    0.428    0.526    0.801    1.20     1.88
        120    0.393    0.446    0.507    0.617    0.895    1.32     2.06

-0.008   10    0.291    0.314    0.340    0.373    0.432    0.492    0.621
         20    0.405    0.446    0.500    0.588    0.853    1.27     1.98
         24    0.428    0.472    0.535    0.632    0.888    1.37     2.16
         30    0.471    0.522    0.600    0.724    1.08     1.65     2.69
         40    0.512    0.574    0.664    0.799    1.19     1.82     2.80
         60    0.591    0.660    0.760    0.934    1.42     2.13     3.34
        120    0.699    0.794    0.901    1.10     1.60     2.34     3.67

TABLE  5. 2d.  Upper  Tail  Percentage  Points  for  the  Statistic 

T-i (c)  x  100,  p  =  2 

PERCENT 


1 

fc* 

rl»j 

S 


0. 

.183 

0. 

.253 

0. 

.268 

0. 

.295 

0, 

.321 

0 

.370 

0. 

.440 

.196 

0.213 

.278 

0.310 

.296 

0.334 

.327 

0.375 

.360 

0.412 

.414 

0.476 

.499 

0.568 
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TABLE  5.2e.  Upper  Tail  Percentage  Points  for  the  Statistic 

T-|  (c) ,  p  =  2 

PERCENT 

c  n  75  80  85  90  95  97.5  99 


10 

0.546 

0.875 

1.72 

5.65 

14.4 

31  .5 

68.0 

20 

0.664 

0.845 

1.20 

1  .84 

3.51 

5.82 

10.7 

24 

0.687 

0.854 

1.15 

1.68 

2.91 

4.75 

8.97 

30 

0.716 

0.857 

1  .14 

1  .64 

2.75 

4.16 

6.91 

40 

0 . 740 

0.870 

1.10 

1.56 

2.40 

3.45 

5.22 

60 

0.776 

0.895 

1.10 

1  .43 

2.10 

2.95 

4.14 

120 

0.792 

0.901 

1  .05 

1.31 

1.82 

2.45 

3.19 

10 

194 

■a 

223 

0 

291 

0 

505 

1 

53 

1 

14 

2 

26 

267 

0 

316 

0 

409 

0 

619 

1 

24 

2 

.06 

4 

EE 

24 

287 

0 

334 

0 

422 

0 

628 

1 

.14 

1 

.92 

3 

42 

30 

305 

0 

356 

0 

446 

0 

650 

1 

13 

1 

.85 

3 

EEI 

328 

0 

385 

0 

469 

0 

642 

1 

07 

1 

57 

2 

54 

Wm 

352 

0 

404 

0 

484 

0 

643 

El 

970 

1 

.38 

1 

98 

120 

0 

374 

El 

427 

0 

498 

0 

610 

0 

869 

1 

19 

1 

69 

10 

0.0457 

0.0499 

0.0555 

0.0650 

0.106 

0.170 

0.273 

20 

0.0628 

0.0705 

0.0827 

0.110 

0.203 

0.331 

0.655 

24 

0.0686 

0.0775 

0.0901 

0.121 

0.207 

0.345 

0.603 

30 

0.0737 

0.0848 

0.101 

0.136 

0.230 

0.362 

0.635 

40 

0.0820 

0.0932 

0.111 

0.145 

0.238 

0.355 

0.607 

60 

0.0908 

0.103 

0.122 

0.157 

0.239 

0.350 

0.512 

120 

0.0998 

0.113 

0.132 

0.163 

0.229 

0.321 

0.500 

10 

0.0480 

0.0516 

0.0554 

0.0604 

0.0676 

0.0738 

0.0808 

20 

0.0643 

0.0700 

0.0770 

0.0865 

0.106 

0.134 

0.192 

24 

0.0689 

0.0755 

0.0834 

0.0953 

0.119 

0.159 

0.226 

30 

0.0757 

0.0837 

0.0932 

0.108 

0.137 

0.185 

0.286 

40 

0.0844 

0.0929 

0.105 

0.124 

0.162 

0.224 

0.339 

60 

0.0959 

0.108 

0.122 

0.146 

0.197 

0.278 

0.414 

120 

0.114 

0.128 

0.147 

0.177 

0.243 

0.350 

0.559 

TABLE  5.3a.  Upper  Tail  Percentage  Points  the  Statistic 
T-|  (c)  x  1000,  p  =  3 


PERCENT 


0.025 


0.015 


n 

75 

80 

85 

90 

95 

97 

.5 

10 

0 

560 

0 

591 

0 

628 

0 

681 

0 

785 

0. 

899 

20 

0 

831 

0 

908 

1 

04 

1 

26 

1 

87 

2. 

54 

24 

0 

910 

1 

00 

1 

15 

1 

42 

2 

13 

2 

97 

30 

1 

01 

1 

12 

1 

29 

1 

62 

2 

40 

3. 

40 

40 

1 

11 

1 

24 

1 

45 

1 

80 

2 

67 

3 

67 

60 

1 

26 

1 

41 

1 

65 

2 

03 

2 

89 

4 

09 

120 

1 

44 

1 

61 

1 

85 

2 

22 

2 

96 

3 

95 

10 

0 

358 

0 

377 

0 

401 

0 

433 

0 

496 

0 

560 

20 

0 

531 

0 

579 

0 

656 

0 

799 

1 

16 

1 

57 

24 

0 

579 

0 

640 

0 

725 

0 

900 

1 

33 

1 

85 

30 

0 

642 

0 

716 

0 

819 

1 

02 

1 

50 

2 

13 

40 

0 

710 

0 

795 

0 

916 

1 

14 

1 

68 

2 

32 

60 

0 

802 

0 

900 

1 

05 

1 

29 

1 

84 

2 

61 

120 

0 

925 

1 

03 

1 

18 

1 

42 

1 

90 

2 

54 

10 

0 

201 

0 

212 

0 

225 

0 

242 

0 

276 

0 

308 

20 

0 

298 

0 

324 

0 

365 

0 

442 

0 

640 

0 

858 

24 

0 

325 

0 

359 

0 

405 

0 

499 

0 

730 

1 

01 

30 

0 

361 

0 

400 

0 

458 

0 

569 

0 

830 

1 

17 

40 

0 

399 

0 

446 

0 

511 

0 

635 

0 

932 

1 

28 

60 

0 

451 

0 

506 

0 

590 

0 

723 

1 

03 

1 

46 

120 

0 

522 

0 

584 

0 

667 

0 

800 

1 

07 

1 

44 

10 

0 

0892 

0 

0940 

0 

0997 

0 

107 

0 

121 

0 

134 

20 

0 

132 

0 

143 

0 

161 

0 

194 

0 

277 

0 

370 

24 

0 

144 

0 

158 

0 

178 

0 

218 

0 

318 

0 

435 

30 

0 

160 

0 

177 

0 

202 

0 

250 

0 

362 

0 

508 

40 

0 

177 

0 

197 

0 

226 

0 

279 

0 

411 

0 

564 

60 

0 

232 

0 

260 

0 

296 

0 

356 

0 

454 

0 

643 

120 

0 

232 

0 

260 

0 

296 

0 

356 

0 

476 

0 

639 

0.657 

2.31 

2.70 

3.08 

3.51 

3.99 

3.73 


0.360 
1.25 
1  " 

1 
1 
2 
2 
TABLE 5.3b.  Upper Tail Percentage Points for the Statistic
T_1(c) x 1000, p = 3

(The c labels of the first and last blocks are not legible in the
source; only four entries of the final n = 120 row survive.)

                                  PERCENT
  c      n      75       80       85       90       95       97.5     99

  ?      10    0.571    0.602    0.637    0.684    0.771    0.851    0.982
         20    0.843    0.915    1.02     1.23     1.76     2.33     3.36
         24    0.922    1.01     1.14     1.39     2.02     2.75     4.01
         30    1.02     1.13     1.29     1.60     2.30     3.22     4.60
         40    1.13     1.26     1.44     1.78     2.60     3.59     5.33
         60    1.29     1.43     1.67     2.04     2.89     4.10     6.28
        120    1.49     1.67     1.90     2.29     3.06     4.10     6.02

 0.006   10    0.321    0.339    0.358    0.385    0.432    0.475    0.548
         20    0.474    0.514    0.575    0.689    0.979    1.30     1.86
         24    0.517    0.568    0.637    0.778    1.12     1.53     2.22
         30    0.574    0.634    0.721    0.893    1.29     1.80     2.55
         40    0.637    0.708    0.810    0.994    1.46     2.01     2.98
         60    0.723    0.805    0.936    1.14     1.62     2.30     3.52
        120    0.837    0.938    1.07     1.28     1.72     2.31     3.39

 0.004   10    0.143    0.151    0.159    0.171    0.191    0.210    0.241
         20    0.211    0.228    0.255    0.304    0.431    0.571    0.814
         24    0.229    0.252    0.282    0.343    0.496    0.672    0.977
         30    0.255    0.281    0.319    0.395    0.568    0.792    1.12
         40    0.282    0.315    0.359    0.440    0.644    0.890    1.31
         60    0.321    0.357    0.415    0.506    0.719    1.02     1.56
        120    0.372    0.417    0.475    0.571    0.764    1.03     1.51

  ?      10    0.0357   0.0376   0.0398   0.0427   0.0478   0.0521   0.0598
         20    0.0526   0.0568   0.0635   0.0756   0.107    0.141    0.201
         24    0.0572   0.0629   0.0703   0.0853   0.123    0.166    0.241
         30    0.0636   0.0702   0.0797   0.0983   0.141    0.196    0.276
         40    0.0706   0.0785   0.0894   0.110    0.160    0.221    0.325
         60    0.0802   0.0893   0.104    0.126    0.179    0.254    0.389
        120    surviving entries: 0.104, 0.143, 0.191, 0.378

TABLE 5.3c.  Upper Tail Percentage Points for the Statistic
T_1(c) x 1000, p = 3

("?" marks a digit illegible in the source.)

                                  PERCENT
  c      n      75       80       85       90       95       97.5     99

-0.002   10    0.0358   0.0376   0.0398   0.0426   0.0476   0.0514   0.0585
         20    0.0525   0.0567   0.0632   0.0748   0.105    0.138    0.196
         24    0.0571   0.0626   0.0699   0.0846   0.121    0.163    0.235
         30    0.0633   0.0699   0.0793   0.0972   0.139    0.193    0.271
         40    0.0704   0.0782   0.0890   0.109    0.158    0.217    0.320
         60    0.0802   0.0892   0.103    0.126    0.178    0.253    0.387
        120    0.0934   0.105    0.119    0.143    0.191    0.258    0.380

-0.004   10    0.143    0.150    0.159    0.170    0.190    0.204    0.231
         20    0.210    0.227    0.252    0.298    0.416    0.546    0.771
         24    0.228    0.250    0.279    0.337    0.478    0.646    0.931
         30    0.253    0.279    0.316    0.387    0.535    0.764    1.08
         40    0.281    0.312    0.355    0.433    0.629    0.861    1.27
         60    0.320    0.357    0.413    0.501    0.710    1.01     1.54
        120    0.374    0.419    0.476    0.572    0.766    1.03     1.52

-0.006   10    0.322    0.338    0.359    0.384    0.426    0.460    0.516
         20    0.472    0.510    0.566    0.664    0.930    1.21     1.71
         24    0.513    0.561    0.625    0.754    1.07     1.44     2.07
         30    0.569    0.627    0.709    0.868    1.24     1.70     2.40
         40    0.633    0.701    0.798    0.971    1.41     1.92     2.83
         60    0.721    0.801    0.927    1.12     1.59     2.25     3.44
        120    0.843    0.943    1.07     1.29     1.72     2.33     3.41

-0.008   10    0.57?    0.601    0.636    0.681    0.755    0.814    0.914
         20    0.838    0.905    1.00     1.18     1.64     2.13     3.00
         24    0.912    0.995    1.11     1.33     1.89     2.53     3.64
         30    1.01     1.11     1.26     1.54     2.18     3.01     4.23
         40    1.12     1.24     1.41     1.72     2.49     3.39     4.98
         60    1.28     1.42     1.65     1.99     2.82     3.99     6.08
        120    1.50     1.68     1.90     2.29     3.06     4.14     6.06
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TABLE  5.3d.  Upper  Tail  Percentage  Points  for  the  Statistic 

T i ( c )  x  1000,  p  =  3 


PERCENT 


95  97.5 


.211 

0.224 

0 

239 

0 

264 

0 

.283 

0 

.315 

i 

.316 

0.349 

0 

405 

0 

559 

0 

718 

1 

.00 

f 

.348 

0.385 

0 

459 

0 

647 

0 

.861 

1 

•  22 

I 

.389 

0.438 

0 

531 

0 

745 

1 

03 

1 

43 

* 

.436 

0.494 

0 

596 

0 

859 

1 

16 

1 

70 

.501 

0.576 

0 

699 

0 

980 

1 

39 

2 

11 

.591 

0.670 

0 

806 

1 

07 

1 

46 

2 

Id 

a 

0 

0. 

.358 

0. 

,37 

0 

0, 

.523 

0. 

.56 

4 

0. 

.569 

0. 

.61 

0 

0. 

.628 

0. 

.68 

0 

0. 

.701 

0, 

.77 

lO 

0, 

.802 

0. 

.89 

!0 

0. 

.944 

1 . 

.05 

• 

-0.025 

10 

0.561 

0.588 

0.623 

0.663 

0.729 

0.777 

0.851 

fc: 

20 

0.815 

0.874 

0.960 

1.11 

1  .48 

1  .89 

2.61 

m#w  | 

S 

24 

0.886 

0.964 

1.06 

1.25 

1.73 

2.28 

3.21 

30 

0.978 

1  .07 

1  .20 

1  .44 

2.00 

2.73 

3.79 

:  :K  I 

s 

40 

1  .09 

1  .21 

1.36 

1  .63 

2.32 

3.13 

4.51 

11 

A 

60 

1.25 

1.39 

1  .59 

1  .92 

2.67 

3.77 

5.68 

i 

E;: 

120 

1  .48 

1  .65 

1  .87 

2.24 

3.00 

4.07 

5.94 

i 
TABLE 5.3e. Upper Tail Percentage Points for the Statistic T1(c) x 1000, p = 3


PERCENT 


c 

n 

75 

80 

85 

90 

95 

97 

.5 

99 

0.3 

10 

9 

.53 

14.2 

21.7 

42.1 

107.0 

267.0 

793.0 

20 

2 

,46 

3 

.29 

4.51 

6.73 

12.3 

19.9 

35.4 

24 

2 

.24 

2 

.87 

3.76 

5.35 

8 

1.57 

13.8 

22.2 

30 

2 

.09 

2 

'.60 

3.29 

4.42 

6 

.99 

10.5 

16.0 

40 

1 

.93 

2 

'.29 

2.79 

3.32 

5 

i.13 

7 

.19 

10.8 

60 

1 

.83 

2 

M2 

2.50 

3.14 

4 

.32 

5 

i.75 

7.80 

120 

1 

.72 

1 

.92 

2.19 

2.60 

3 

.49 

4 

.43 

5.70 

0.2 

10 

0 

.600 

0. 

814 

1 

.35 

6 

.95 

23 

i.l 

38 

l.O 

82.0 

20 

0 

.763 

0. 

972 

1 

.27 

1 

.88 

3. 

50 

6. 

10 

11  .0 

24 

0 

.787 

0. 

973 

1 

.25 

1 

.80 

3. 

03 

4. 

97 

8.62 

30 

0 

.783 

0. 

935 

1 

.20 

1 

.65 

2. 

66 

4. 

04 

6.36 

40 

0 

.792 

0. 

934 

1 

.14 

1 

.46 

2. 

15 

3. 

10 

4.52 

60 

0 

.803 

0. 

933 

1 

.11 

1 

.38 

1  . 

95 

2. 

63 

3.64 

120 

0 

.81 1 

0. 

905 

1 

.04 

1 

.24 

1  . 

66 

2. 

07 

2.75 

10 

0 

.0959 

0. 

104 

0 

.117 

0 

.144 

0. 

207 

0. 

280 

0.407 

20 

0 

.145 

0. 

168 

0 

.206 

0 

.277 

0. 

454 

0. 

748 

1  .22 

24 

0 

.158 

0. 

181 

0 

.222 

0 

.300 

0. 

490 

0. 

785 

1  .38 

30 

0. 

.169 

0. 

194 

0 

.235 

0 

.314 

0. 

486 

0. 

727 

1  .25 

40 

0. 

.182 

0. 

208 

0 

.250 

0 

.314 

0. 

472 

0. 

682 

1  .03 

60 

0. 

.197 

0. 

225 

0 

.268 

0 

.336 

0. 

485 

0. 

665 

0.914 

120 

0. 

,216 

0. 

241 

0 

.274 

0 

.329 

0. 

438 

0. 

560 

0.767 

10 

0. 

.0940 

0. 

0991 

0 

.104 

0 

.111 

0. 

120 

0. 

129 

0.138 

20 

0, 

,130 

0. 

138 

0 

.149 

0 

.164 

0. 

195 

0. 

229 

0.287 

24 

0, 

.141 

0. 

151 

0 

.164 

0 

.183 

0. 

221 

0. 

280 

0.370 

30 

0, 

.157 

0. 

169 

0 

.185 

0 

.208 

0. 

258 

0. 

320 

0.447 

40 

0 

.173 

0. 

189 

0 

.209 

0 

.240 

0. 

303 

0. 

390 

0.510 

60 

0 

.200 

0. 

221 

0 

.248 

0 

.293 

0. 

388 

0. 

493 

0.644 

120 

0 

.243 

0. 

268 

0 

.302 

0 

.354 

0. 

477 

0. 

633 

0.839 
TABLE 5.4a. Upper Tail Percentage Points for the Statistic T1(c) x 1000, p = 4

                                      PERCENT
      c     n       75       80       85       90       95     97.5       99
   0.01    10    0.148    0.154    0.160    0.169    0.181    0.195    0.213
           20    0.237    0.257    0.287    0.345    0.461    0.603    0.817
           24    0.260    0.285    0.324    0.397    0.541    0.707    0.977
           30    0.291    0.322    0.367    0.443    0.599    0.784     1.10
           40    0.325    0.361    0.414    0.503    0.704    0.929     1.29
           60    0.372    0.412    0.469    0.555    0.780     1.03     1.42
          120    0.425    0.467    0.531    0.630    0.815     1.05     1.45

  0.008    10   0.0945   0.0982    0.103    0.108    0.116    0.124    0.135
           20    0.151    0.164    0.182    0.219    0.291    0.380    0.514
           24    0.166    0.182    0.206    0.252    0.342    0.447    0.614
           30    0.186    0.206    0.234    0.282    0.380    0.496    0.697
           40    0.208    0.230    0.264    0.320    0.448    0.589    0.818
           60    0.238    0.263    0.300    0.354    0.498    0.659    0.908
          120    0.272    0.299    0.340    0.404    0.521    0.675    0.930

  0.006    10   0.0531   0.0552   0.0576   0.0607   0.0649   0.0695   0.0757
           20   0.0848   0.0916    0.102    0.122    0.162    0.211    0.284
           24   0.0932    0.102    0.115    0.140    0.190    0.248    0.340
           30    0.104    0.115    0.131    0.157    0.212    0.276    0.387
           40    0.117    0.129    0.148    0.179    0.250    0.329    0.456
           60    0.134    0.148    0.168    0.199    0.279    0.369    0.509
          120    0.153    0.168    0.191    0.227    0.294    0.380    0.524

  0.005    10   0.0369   0.0383   0.0400   0.0421   0.0450   0.0481   0.0523
           20   0.0588   0.0635   0.0706   0.0847    0.122    0.146    0.195
           24   0.0646   0.0705   0.0797   0.0969    0.131    0.172    0.234
           30   0.0724   0.0798   0.0906    0.109    0.147    0.191    0.267
           40   0.0810   0.0896    0.102    0.124    0.173    0.228    0.315
           60    0.929    0.102    0.117    0.138    0.194    0.255    0.353
          120    0.106    0.117    0.132    0.158    0.204    0.264    0.364
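
As an illustration of how an upper-tail table of this kind might be used, the sketch below copies the 95 percent points for p = 4 and c = 0.01 from Table 5.4a and compares a scaled statistic against them. It assumes that the tabulated entries are percentage points of 1000 times T1(c) and that normality is rejected when the scaled statistic exceeds the tabulated point; the function name `rejects_normality` and the dictionary `CRIT_95` are hypothetical helpers, not part of the thesis.

```python
# 95 percent points of 1000 * T1(c) for p = 4, c = 0.01, copied from Table 5.4a.
CRIT_95 = {10: 0.181, 20: 0.461, 24: 0.541, 30: 0.599,
           40: 0.704, 60: 0.780, 120: 0.815}


def rejects_normality(t1, n, crit=CRIT_95, scale=1000.0):
    """Return True if scale * t1 exceeds the tabulated point for sample size n.

    Assumption: the test is one-sided and rejects for large values, which is
    how an upper-tail percentage point is normally applied.
    """
    if n not in crit:
        # No interpolation is attempted; only tabulated sample sizes are used.
        raise KeyError(f"sample size {n} is not tabulated")
    return scale * t1 > crit[n]
```

For a sample size that is not tabulated, the sketch deliberately raises an error rather than interpolating, since the thesis text shown here does not say how intermediate values of n should be handled.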
TABLE 5.4b. Upper Tail Percentage Points for the Statistic T1(c) x 1000, p = 4


PERCENT 


95  97.5 


0, 

.236 

0. 

.245 

0. 

,256 

0. 

,270 

0. 

.376 

0. 

.406 

0. 

,450 

0. 

,539 

0, 

.413 

0. 

450 

0. 

,509 

0. 

,618 

0 

.463 

0, 

.510 

0. 

,579 

0, 

,696 

0. 

.518 

0. 

,573 

0. 

655 

0. 

,794 

0. 

.594 

0. 

.655 

0. 

745 

0. 

,881 

0, 

.681 

0, 

.747 

0. 

,848 

1 . 

.01 

0.003 


0.133 

0.211 

0.232 

0.260 

0.291 

0.334 

0.383 


0.138 

0.228 

0.253 

0.286 

0.322 

0.368 

0.420 


0.144 

0.253 

0.285 

0.325 

0.367 

0.419 

0.477 


0.390 

0.446 

0.494 

0.569 


.162 

.399 

.468 

0.524 

0.618 

0.695 

0.733 


0.17 

0.51 

0.60 

0.679 

0.814 

0.914 

0.954 


0.950 

1.12 

1.26 

1.31 


TABLE 5.4c. Upper Tail Percentage Points for the Statistic T1(c) x 1000, p = 4

                                      PERCENT
      c     n       75       80       85       90       95     97.5       99
 -0.001    10   0.0147   0.0153   0.0160   0.0168   0.0179   0.0190   0.0205
           20   0.0233   0.0252   0.0280   0.0330   0.0432   0.0560   0.0739
           24   0.0256   0.0279   0.0314   0.0379   0.0508   0.0660   0.0894
           30   0.0288   0.0316   0.0357   0.0428   0.0572   0.0740    0.103
           40   0.0322   0.0356   0.0406   0.0490   0.0677   0.0891    0.122
           60   0.0370   0.0407   0.0462   0.0545   0.0768    0.100    0.139
          120   0.0426   0.0467   0.0530   0.0631   0.0817    0.106    0.145

 -0.002    10   0.0589   0.0613   0.0638   0.0672   0.0716   0.0760   0.0818
           20   0.0933    0.101    0.111    0.131    0.172    0.222    0.294
           24    0.102    0.111    0.125    0.151    0.202    0.262    0.355
           30    0.115    0.126    0.142    0.171    0.228    0.294    0.409
           40    0.128    0.142    0.162    0.196    0.270    0.355    0.488
           60    0.148    0.163    0.185    0.218    0.306    0.401    0.556
          120    0.170    0.187    0.212    0.252    0.327    0.425    0.582

 -0.003    10    0.133    0.138    0.144    0.151    0.161    0.171    0.184
           20    0.210    0.226    0.249    0.294    0.385    0.497    0.656
           24    0.230    0.250    0.280    0.339    0.453    0.587    0.792
           30    0.258    0.283    0.319    0.382    0.511    0.659    0.916
           40    0.289    0.319    0.364    0.438    0.606    0.795     1.09
           60    0.333    0.366    0.415    0.490    0.689    0.899     1.25
          120    0.383    0.421    0.477    0.568    0.737    0.956     1.31

 -0.004    10    0.236    0.245    0.255    0.269    0.286    0.303    0.326
           20    0.372    0.401    0.442    0.521    0.682    0.879     1.16
           24    0.408    0.444    0.497    0.600    0.801     1.04     1.40
           30    0.458    0.502    0.567    0.678    0.905     1.16     1.62
           40    0.513    0.566    0.650    0.777     1.07     1.41     1.93
           60    0.592    0.650    0.737    0.869     1.22     1.59     2.22
          120    0.682    0.748    0.848     1.01     1.31     1.70     2.33
TABLE 5.4d. Upper Tail Percentage Points for the Statistic T1(c) x 1000, p = 4


PERCENT 


95  97.5 


0.0369  0.0383  0.0399  0.0419  0.0447  0.0473  0.0509 


0.0580 

0.0638 

0.0715 

0.0801 

0.0924 

0.107 


0.0625 

0.0692 

0.0783 

0.0883 

0.101 

0.117 


0.0689 

0.0775 

0.0833 

0.101 

0.115 

0.132 


0.0812 

0.0935 

0.106 

0.121 

0.135 

0.158 


0.106 

0.125 

0.141 

0.167 

0.191 

0.204 


0.137 

0.161 

0.181 

0.219 

0.248 

0.265 


0.180 

0.217 

0.252 

0.300 

0.345 

0.363 


0.0531  0.0552  0.0574  0.0604  0.0643  0.0680  0.0731 


0.0835  0.0899 

0.0917  0.0996 


0.103 

0.115 

0.133 

0.153 


0.0944 

0.148 

0.163 

0.182 

0.204 

0.236 

0.237 


0.113 

0.127 

0.146 

0.168 


0.0989 

0.111 

0.127 

0.145 

0.166 

0.191 


0.0980  0.102 


0.160 

0.176 

0.199 

0.225 

0.259 

0.299 


0.175 

0.197 

0.224 

0.256 

0.294 

0.340 


0.116 

0.134 

0.152 

0.174 

0.195 

0.227 


0.107 

0.206 

0.237 

0.268 

0.308 

0.345 

0.403 


0.152 

0.196 

0.257 

0.178 

0.230 

0.310 

0.202 

0.259 

0.360 

0.239 

0.314 

0.430 

0.274 

0.357 

0.496 

0.294 

0.382 

0.522 

0.114 

0.121 

0.129 

0.343 

0.315 

0.404 

0.356 

0.456 

0.633 

0.423 

0.553 

0.758 

0.486 

0.633 

0.878 

0.524 

0.680 

0.928 


6.43 

8.05 

10.5 

14.8 

25.0 

40.1 

5.27 

6.36 

7.95 

10.4 

15.7 

23.1 

4.54 

5.30 

6.32 

7.92 

11.0 

14.5 

3.86 

4.38 

5.09 

6.17 

8.21 

10.5 

3.53 

3.90 

4.35 

5.07 

6.37 

7.64 

1 

2 

i4 

2 

9 

2 

7 

1 

0.241 

0.268 

0.307 

0.356 

0.441 


.379 

0. 

.454 

0. 

,583  i 

1.404 

0, 

.485 

0. 

.639 

1.431 

0. 

.511 

0. 

.644  i 

1.441 

0 

.512 

0, 

.626 

1.463 

0, 

.524 

0. 

.613  i 

1.255 

0, 

.273 

0. 

,300 

1.285 

0. 

.307 

0. 

.343 

1.330 

0 

.358 

0, 

.404 

1.384 

0 

.424 

0 

.478 

1.478 

0. 

.526 

0 

.613 

0.344 

0.416 

0.505 

0.603 

0.795 


.37 

2.29 

• 

■  m 

.44 

2.19 

.36 

2.04 

.19 

1.72 

.02 

1.32 

0.409 
0.506 
0.636 
0.763 
1  .02 


TABLE 5.5a. Upper Tail Percentage Points for the Statistic T1(c) x 1000, p = 5

                                      PERCENT
      c     n       75       80       85       90       95     97.5       99
  0.004    24    0.664    0.723    0.806    0.937     1.22     1.50     1.88
           30    0.750    0.820    0.920     1.08     1.39     1.73     2.27
           40    0.851    0.937     1.05     1.23     1.64     2.07     2.67
           60    0.970     1.06     1.19     1.41     1.83     2.33     3.10
          120     1.14     1.24     1.38     1.61     2.06     2.54     3.23

 0.0035    24    0.508    0.553    0.616    0.715    0.931     1.14     1.44
           30    0.573    0.627    0.703    0.829     1.06     1.32     1.73
           40    0.651    0.717    0.806    0.941     1.25     1.58     2.04
           60    0.742    0.810    0.912     1.08     1.40     1.78     2.37
          120    0.870    0.947     1.06     1.23     1.57     1.95     2.47

  0.003    24    0.373    0.406    0.452    0.524    0.682    0.838     1.05
           30    0.421    0.460    0.516    0.608    0.778    0.966     1.27
           40    0.478    0.526    0.592    0.691    0.917     1.16     1.49
           60    0.545    0.595    0.669    0.722     1.02     1.30     1.74
          120    0.639    0.696    0.778    0.905     1.16     1.43     1.81

 0.0025    24    0.259    0.281       --    0.363    0.472    0.580       --
           30    0.292    0.319    0.358    0.421    0.539    0.669    0.876
           40    0.331    0.365    0.410    0.479    0.635    0.884     1.03
           60    0.378    0.413    0.464    0.550    0.710    0.905     1.20
          120    0.444    0.483    0.541    0.628    0.802    0.993     1.26

0.001 


.413 

0.448 

.465 

0.508 

.529 

0.582 

.605 

0.659 

.709 

0.774 

.103 

0. 

112 

.116 

0. 

127 

.132 

0. 

145 

.151 

0. 

165 

.  177 

0. 

194 

0.498 

0.570 

0.653 

0.741 

0.864 


0.577 
0.670 
0.763 
0.876 
1  .01 


0.748 
0.856 
1  .01 
1.13 
1  .28 


TABLE 5.5c. Upper Tail Percentage Points for the Statistic T1(c) x 10,000, p = 5

PERCENT 


0.0005 


-0.002 


4 

1 

0 

1 

0 

2 

0 

2 

0 

2 

0. 

,124 

0. 

,143 

0. 

,142 

0. 

,166 

0. 

163 

0. 

,190 

0. 

,185 

0, 

,218 

0. 

,216 

0. 

,251 

.4 

1 

5 

2 

0 

2 

.186 

0.227 

0.283 

5 

.212 

0.263 

0.344 

i 

.251 

0.317 

0.406 

5 

.282 

0.359 

0.476 

» 

.320 

0.396 

0.502 

7 

t* 

c 

.411 

0.446 

0.496 

0.572  C 

.464 

0.506 

0.567 

0.664  C 

.527 

0.580 

0.650 

0.759  1 

.603 

0.657 

0.738 

0.872 

.709 

0.774 

0.863 

1.01 

TABLE  5.5d.  Upper  Tail  Percentage  Points  for  the  Statistic 

T1(c)  x  1000,  p  =  5 

PERCENT 


0.0025 


.277 

0, 

.308 

0, 

,356 

.315 

0. 

,353 

0. 

,412 

.361 

0. 

,404 

0, 

.472 

.410 

0. 

,460 

0, 

,544 

.484 

0, 

,539 

0, 

.627 

0.802 
0.932 
1  .13 
1  .28 
1  .43 


09 

1  .3 

27 

1  .6 

53 

1  .9 

74 

2.3 

94 

2.4 

TABLE 5.5e. Upper Tail Percentage Points for the Statistic T1(c), p = 5

                                      PERCENT
      c     n       75       80       85       90       95     97.5       99
    0.3    24     18.3     23.2     31.6     47.7     89.8    167.0    317.0
           30     12.2     14.8     18.3     24.3     40.3     64.1    120.0
           40     9.03     10.5     12.4     15.4     21.7     28.7     40.7
           60     7.24     8.16     9.40     11.2     14.7     18.4     24.0
          120     6.03     6.59     7.36     8.51     10.4     12.5     15.2

    0.2    24     4.18     5.19     6.70     9.02     14.9     22.3     35.0
           30     3.68     4.39     5.40     7.04     10.5     14.3     22.3
           40     3.32     3.80     4.50     5.63     7.71     10.3     14.2
           60     3.03     3.38     3.90     4.68     6.32     7.92     10.2
          120     2.74     2.99     3.33     3.84     4.78     5.73     6.94

    0.1    24    0.585    0.676    0.814     1.07     1.60     2.43     4.18
           30    0.629    0.728    0.869     1.12     1.58     2.28     3.47
           40    0.660    0.750    0.878     1.10     1.56     2.12     3.25
           60    0.678    0.765    0.876     1.07     1.46     1.88     2.57
          120    0.696    0.762    0.855    0.995     1.26     1.53     1.96

   -0.1    24    0.371    0.389    0.412    0.443    0.506    0.572    0.665
           30    0.418    0.441    0.469    0.511    0.594    0.686    0.823
           40    0.475    0.502    0.541    0.606    0.726    0.873     1.08
           60    0.563    0.602    0.656    0.736    0.896     1.11     1.38
          120    0.702    0.757    0.829    0.941     1.18     1.45     1.88

TABLE 5.6b. Upper Tail Percentage Points for the Statistic T1(c) x 10,000, p = 6

                                      PERCENT
      c     n       75       80       85       90       95     97.5       99
  0.002    24     2.43     2.63     2.90     3.31     4.04     4.87     6.03
           30     2.78     3.02     3.36     3.87     4.80     5.91     7.67
           40     3.21     3.49     3.91     4.54     5.78     6.98     8.97
           60     3.69     4.05     4.51     5.21     6.65     8.19     10.5
          120     4.30     4.64     5.11     5.88     7.32     8.86     11.5

 0.0015    24     1.37     1.48     1.63     1.86     2.27     2.72     3.37
           30     1.56     1.70     1.89     2.18     2.69     3.31     4.30
           40     1.80     1.96     2.20     2.55     3.24     3.91     5.03
           60     2.08     2.28     2.53     2.92     3.73     4.60     5.92
          120     2.42     2.61     2.87     3.31     4.12     4.98     6.47

  0.001    24    0.607    0.655    0.723    0.823     1.00     1.21     1.49
           30    0.693    0.753    0.838    0.964     1.19     1.47     1.90
           40    0.799    0.869    0.974     1.13     1.44     1.74     2.23
           60    0.922     1.01     1.12     1.30     1.66     2.04     2.63
          120     1.07     1.16     1.28     1.47     1.83     2.21     2.88

0.152 

.173 

.200 

.230 

.269 


0.163 

0.188 

0.217 

0.253 

0.290 


0.180 

0.209 

0.243 

0.281 

0.319 


0.205 

0.240 

0.282 

0.324 

0.367 


0.250 
0.297 


0.301 


0.371 




c 

n 

75 

80 

85 

90 

95 

97 

'.5 

99 

-0. 

.0005 

24 

0 

.151 

0, 

.163 

0, 

.180 

0 

.204 

0, 

.248 

0. 

299 

0. 

368 

30 

0 

.172 

0. 

.187 

0, 

.208 

0 

.239 

0. 

.296 

0. 

362 

0. 

470 

40 

0 

.199 

0 

.217 

0, 

.243 

0 

.281 

0, 

.356 

0. 

430 

0. 

551 

60 

0 

.230 

0. 

.252 

0, 

.280 

0 

.323 

0, 

.412 

0. 

508 

0. 

653 

120 

0 

.269 

0 

.290 

0. 

.319 

0 

.367 

0, 

.458 

0. 

551 

0. 

720 

-0. 

.001 

24 

0 

.604 

0 

.650 

0, 

,716 

0 

.815 

0 

.991 

1 . 

19 

1 . 

46 

30 

0 

.689 

0 

.749 

0, 

.831 

0 

.955 

1 

.18 

1 . 

44 

1 . 

87 

40 

0, 

.796 

0 

.866 

0. 

,969 

1 

.12 

1 

.42 

1 . 

71 

2. 

20 

60 

0 

.918 

1 

.01 

1  , 

.12 

1 

.29 

1 

.64 

2. 

03 

2. 

61 

120 

1 

.07 

1 

.16 

1  , 

,27 

1 

.47 

1 

.83 

2. 

20 

2. 

88 

-0 

.0015 

24 

1 

.36 

1 

.46 

1 

.61 

1 

.83 

2 

.22 

2. 

67 

3. 

28 

30 

1 

.55 

1 

.68 

1 

.86 

2 

.14 

2 

.64 

3. 

24 

4. 

20 

40 

1 

.79 

1 

.95 

2 

.18 

2 

.52 

3 

.19 

3. 

85 

4. 

93 

60 

2 

.06 

2 

.26 

2 

.52 

2 

.91 

3 

.69 

4. 

56 

5. 

86 

120 

2 

.42 

2 

.61 

2 

.87 

3 

.30 

4 

.11 

4. 

96 

6. 

47 

-0. 

.002 

24 

2 

.41 

2. 

,59 

2. 

.85 

3. 

.24 

3. 

.94 

4 

73 

5. 

80 

30 

2 

.75 

2 

.98 

3. 

.31 

3, 

.80 

4. 

.69 

5 

74 

7. 

43 

40 

3 

.18 

3, 

,46 

3. 

.86 

4  . 

,47 

5. 

.66 

6 

82 

8. 

74 

60 

3 

.66 

4 

.02 

4  , 

.47 

5. 

.16 

6 

.56 

8 

09 

1C 

1.4 

120 

4 

.30 

4  , 

,63 

5, 

,10 

5, 

.86 

7 

.31 

8 

82 

11 

.  5 



TABLE 5.6d. Upper Tail Percentage Points for the Statistic T1(c) x 1000, p = 6


PERCENT 


c  n 

75 

80 

85 

90 

95 

97.5 

99 

-0.0025  24 

0.376 

0.404 

0.445 

0.505 

0.614 

0.736 

0.903 

30 

0.429 

0.465 

0.516 

0.593 

0.731 

0.894 

1.16 

40 

0.496 

0.539 

0.603 

0.697 

0.883 

1  .03 

1  .36 

60 

0.572 

0.627 

0.698 

0.805 

1  .02 

1  .26 

1  .62 

120 

0.671 

0.720 

0.797 

0.916 

1.14 

1  .38 

1  .80 

-0.003 

24 

0 

540 

0 

581 

0 

640 

0 

726 

0 

882 

1  .06 

1 

26 

30 

0 

617 

0 

670 

0 

742 

0 

852 

1 

.05 

1  .28 

1 

66 

40 

0 

713 

0 

776 

0 

866 

1 

00 

1 

27 

1  .52 

1 

95 

60 

0 

824 

0 

903 

1 

00 

1 

16 

1 

.47 

1  .81 

2 

33 

120 

0 

966 

1 

04 

1 

15 

1 

32 

1 

.64 

1  .98 

2 

59 

-0.0035  24 

0 

.734 

0 

.789 

0 

.869 

0 

.985 

1 

.20 

1  . 

.43 

1  . 

.75 

30 

0. 

.838 

0 

.910 

1 

.01 

1 

.16 

1 

.43 

1 

.73 

2, 

.25 

40 

0 

.970 

1 

.05 

1 

.18 

1 

.36 

1 

.72 

2 

.07 

2. 

.65 

60 

1 

.12 

1 

.23 

1 

.37 

1 

.57 

2 

.00 

2 

.46 

3. 

.16 

120 

1 

.31 

1 

.42 

1 

.56 

1 

.79 

2 

.23 

2 

.69 

3 

52 

24 

0 

.958 

1  .03 

1 

.13 

1 

.28 

1 

.56 

1  . 

.86 

2 

.28 

30 

1 

.09 

1.19 

1 

.31 

1 

.51 

1 

.86 

2. 

,  27 

2 

.93 

40 

1 

.27 

1  .38 

1  . 

.53 

1 

.77 

2, 

.24 

2. 

,70 

3 

.45 

60 

1 

.46 

1  .60 

1 

.78 

2 

.05 

2 

.61 

3. 

.21 

4 

.12 

1  20 

1 

.  7  2 

1  .85 

2 

.03 

2 

.34 

2 

.92 

3. 

.  52 

4 

.60 

TABLE 5.6e. Upper Tail Percentage Points for the Statistic T1(c) x 1000, p = 6


75  80 


PERCENT 


95  97.5 


'km  - . 

0.2  24 

8.67 

11.1 

14.3  ; 

30 

6.97 

8.11 

9.87 

■ 

40 

5.85 

6.69 

7.79  < 

m 

60 

5.00 

5.55 

6.31 

120 

4.39 

4.76 

5.21  ! 

.02  1 

.17 

1  .41 

.07  1 

.21  1 

1  .44 

.09  1 

.23  1 

1  .42 

.08  i 

.21  1 

!  .39 

.10  1 

.19  1 

1.32 

0.187 

0.209 


0.208 

0.232 


0.230  0.255 
0.247  0.275 
0.273  0.298 


0.534  0.558 
0.606  0.635 
0.689  0.730 
0.820  0.873 


0.238 

0.264 

0.289 

0.311 

0.329 


0.585 

0.669 

0.783 


34.0 

52.6 

91  .0 

18.7 

26.1 

39.1 

12.9 

16.7 

23.1 

9.43 

1 1  .4 

14.4 

7.06 

8.11 

9.55 

85 

2.91 

4.31 

7.17 

79 

2.64 

3.76 

5.62 

76 

2.40 

3.18 

4.68 

65 

2.13 

2.65 

3.45 

52 

1  .83 

2.16 

2.59 

0.286 

0.315 

0.346 

0.371 

0.376 


0.626 

0.719 

0.855 


0.387 

0.429 

0.457 

0.469 

0.458 


0.697 

0.826 

1.01 


0.508 

0.564 

0.584 

0.584 

0.547 


0.691 

0.766 

0.812 

0.766 

0.696 


0.783  0.912 

0.951  1.12 

1.18  1.43 




0.183 

0.195 

0.210 

.231 

0.269 

0.304 

! 

0.216 

0.232 

0.253 

0 

.283 

0.338 

0.397 

• 

0.252 

0.271 

0.295 

0 

.334 

0.402 

0.482 

1 

M 

0.293 

0.318 

0.346 

0 

.391 

0.471 

0.561 

0.339 

0.362 

0.394 

.436 

0.524 

0.618 

! 

0.0035 


0.003 


0.140 

0.165 

0.192 

0.224 

0.260 


0.103 

0.121 

0.141 

0.164 


0.149 

0.177 

0.207 


0.161 

0.193 

0.225 


0.243  0.265 

0.277  0.301 


0.109 

0.130 


0.118 

0.141 


0.152  0.165 

0.178  0.194 


0.177 

0.216 

0.255 

0.299 

0.334 


0.129 
0.158 
0.  187 
0.219 


0.205 

0.258 

0.307 

0.360 

0.401 


0.150 

0.189 

0.225 

0.264 

0.294 


0.232 

0.303 

0.368 

0.429 

0.473 


0.170 

0.222 

0.269 

0.314 

0.347 


0.2 
0.2 
0.3 
0.387 
0.426 


TABLE 5.7b. Upper Tail Percentage Points for the Statistic T1(c) x 1000, p = 8


TABLE 5.7c. Upper Tail Percentage Points for the Statistic T1(c) x 1000, p = 8

                                      PERCENT
      c     n       75       80       85       90       95     97.5       99
-0.0005    24   0.0281   0.0298   0.0321   0.0352   0.0408   0.0458   0.0537
           30   0.0331   0.0355   0.0386   0.0431   0.0511   0.0598   0.0709
           40   0.0387   0.0415   0.0451   0.0509   0.0611   0.0728   0.0879
           60   0.0452   0.0489   0.0534   0.0601   0.0722   0.0859    0.105
          120   0.0528   0.0564   0.0612   0.0677   0.0814   0.0961    0.177

 -0.001    24    0.112    0.119    0.128    0.140    0.163    0.182    0.214
           30    0.132    0.142    0.154    0.172    0.204    0.238    0.282
           40    0.155    0.166    0.180    0.203    0.244    0.290    0.350
           60    0.181    0.195    0.213    0.240    0.288    0.343    0.420
          120    0.211    0.226    0.245    0.271    0.326    0.384    0.468

-0.0015    24    0.252    0.267    0.287    0.314    0.364    0.409    0.478
           30    0.297    0.318    0.346    0.386    0.456    0.534    0.032
           40    0.347    0.372    0.404    0.456    0.546    0.651    0.783
           60    0.406    0.439    0.479    0.539    0.647    0.769    0.941
          120    0.475    0.507    0.551    0.608    0.732    0.864     1.05

 -0.002    24    0.446    0.473    0.509    0.557    0.645    0.724    0.846
           30    0.527    0.564    0.613    0.684    0.808    0.946     1.12
           40    0.616    0.661    0.717    0.810    0.969     1.16     1.39
           60    0.721    0.779    0.850    0.957     1.15     1.36     1.67
          120    0.843    0.901    0.978     1.08     1.30     1.54     1.87

-0.0025 


-0.003 


-0.0035 


0.0696 

0.0822 

0.0961 

0.112 

0.132 


0.100 

0.118 

0.138 

0.162 

0.190 


0.136 

0.160 

0.188 

0.220 

0.258 


0.0738  0.0793 

0.0879  0.0956 


0.103 

0.121 

0.141 


0.106 

0.126 

0.148 

0.175 

0.203 


0.144 

0.171 

0.201 

0.237 

0.276 


0.112 

0.133 

0.153 


0.114 

0.137 

0.161 

0.191 

0.220 


0.155 

0.186 

0.218 

0.259 

0.299 


0.0868 

0.106 

0.126 

0.149 

0.169 


0.125 

0.153 

0.181 

0.215 

0.243 


0.169 

0.207 

0.246 

0.292 

0.330 


0.101 

0.126 

0.151 

0.179 

0.203 


0.144 

0.181 

0.217 

0.257 

0.293 


0.196 

0.245 

0.294 

0.349 

0.398 


0.113 

0.147 

0.180 

0.213 

0.240 


0.162 

0.211 

0.258 

0.306 

0.345 


0.219 

0.287 

0.351 

0.415 

0.469 


0.132 
0.174 
0.216 
0 . 2b0 
0.293 


0.189 

0.249 

0.309 

0.373 

0.421 


0.255 

0.338 

0.420 

0.506 

0.574 




0.1 

24 

2.62 

3.09 

3.73 

4.93 

8.01 

12.4 

>3.2 

30 

2.60 

2.98 

3.53 

4.47 

6.64 

9.85 

5.6 

40 

2.58 

2088 

3.34 

4.02 

5.47 

7.29 

0.5 

60 

2.44 

2.69 

2.99 

3.48 

4.36 

5.37 

3.75 

120 

2.29 

2.45 

2.66 

2.96 

3.50 

4.01 

.  90 

0.075 

24 

1  .07 

1.21 

1.39 

1.71 

2.42 

3.29 

.  20 

30 

1.16 

1.30 

1.51 

1.83 

2.56 

3.57 

5.37 

40 

1  .25 

1  .38 

1.58 

1  .88 

2.50 

3.28 

1.74 

60 

1.27 

1  .39 

1.55 

1  .81 

2.25 

2.79 

3.57 

120 

1.27 

1.36 

1  .48 

1  .64 

1  .97 

2.24 

>.82 

-0.05 

24 

0.243 

0.252 

0.263 

0.281 

0.311 

0.338 

1.373 

30 

0.282 

0.296 

0.313 

0.339 

0.383 

0.427 

1.484 

40 

0.338 

0.357 

0.380 

0.415 

0.475 

0.542 

1.637 

60 

0.405 

0.433 

0.464 

0.512 

0.596 

0.685 

1.812 

120 

0.511 

0.542 

0.585 

0.648 

0.755 

0.870 

.06 

-0.1 

24 

0.901 

0.923 

0.955 

0.993 

1  .07 

1  .12 

.18 

30 

1  .04 

1  .07 

1  .11 

1.18 

1.27 

1.35 

.47 

40 

1  .23 

1  .28 

1  .34 

1  .43 

1.57 

1  .71 

.88 

60 

1  .50 

1  .57 

1  .67 

1  .80 

2.01 

2.23 

3.14 

120 

1.97 

2.07 

2.22 

2.42 

2.76 

3.15 

3.65 

TABLE 5.8a. Upper Tail Percentage Points for the Statistic T1(c) x 100, p = 10

PERCENT 


0.004 


0.0035 


0.288 

0.347 

0.421 

0.504 

0.597 


0.220 

0.265 

0.322 

0.386 

0.457 


.303 

0.323 

.368 

0.395 

.447 

0.483 

.533 

0.578 

.635 

0.683 

.232 

0.246 

.281 

0.302 

.342 

0.369 

.408 

0.442 

.486 

0.522 

.170 

0.180 

.206 

0.221 

.250 

0.270 

.299 

0.324 

.357 

0.384 

.346 

.438 

.535 

0.639 

0.747 


0.264 

0.334 

0.408 

0.489 

0.572 


0.386 

0.506 

0.623 

0.756 

0.861 


0.294 

0.386 

0.475 

0.577 

0.658 


.  i  i 

0.882  1.04 

0.987  1.16 


.320 
.431 
0.546 
0.673 
0.755 


.359 
.509 
0.657 
0.795 
0.890 
TABLE 5.8b. Upper Tail Percentage Points for the Statistic T1(c) x 1000, p = 10

                                      PERCENT
      c     n       75       80       85       90       95     97.5       99
  0.002    24    0.711    0.749    0.796    0.853    0.950     1.03     1.16
           30    0.859    0.909    0.976     1.08     1.24     1.39     1.63
           40     1.04     1.11     1.19     1.32     1.54     1.76     2.11
           60     1.25     1.32     1.43     1.59     1.87     2.18     2.57
          120     1.49     1.58     1.70     1.86     2.14     2.46     2.89

 0.0015    24    0.399    0.420    0.446    0.478    0.532    0.578    0.647
           30    0.482    0.510    0.547    0.605    0.696    0.777    0.910
           40    0.585    0.621    0.670    0.741    0.861    0.985     1.18
           60    0.703    0.743    0.805    0.890     1.05     1.22     1.44
          120    0.837    0.890    0.956     1.05     1.20     1.38     1.62

  0.001    24    0.177    0.186    0.198    0.212    0.235    0.256    0.286
           30    0.214    0.226    0.243    0.268    0.308    0.344    0.402
           40    0.260    0.275    0.297    0.328    0.381    0.436    0.522
           60    0.312    0.329    0.357    0.395    0.464    0.540    0.637
          120    0.372    0.395    0.425    0.464    0.534    0.613    0.721

 0.0005    24   0.0441   0.0464   0.0493   0.0528   0.0586   0.0637   0.0713
           30   0.0533   0.0564   0.0605   0.0667   0.0766   0.0856   0.0999
           40   0.0647   0.0686   0.0740   0.0819   0.0949    0.109    1.130
           60   0.0778   0.0822   0.0891   0.0984    0.116    0.135    0.159
          120   0.0928   0.0988    0.106    0.116    0.133    0.153    0.180



-0.001  2 
3 


0.0439 

0.0530 

0.0646 

0.0775 

0.0927 


0.175 

0.212 

0.257 

0.310 

0.371 


0.0461 

0.0560 

0.0682 

0.0819 

0.0986 


1.0490 

0.0524 

0.0582 

1.0601 

0.0663 

0.0760 

L0735 

0.0814 

0.0943 

1.0888 

0.0980 

0.115 

1.106 

0.116 

0.133 

M95 

0.209 

0.232 

1.240 

0.264 

0.303 

'.293 

0.324 

0.376 

.354 

0.391 

0.459 

1.423 

0.463 

0.532 

0.0633 

0.0848 

0.108 

0.134 

0.153 


0.0706 

0.0987 

0.129 

0.157 

0.180 


0.0015  24 

30 


0.393 

0.413 

0.438 

0.468 

0.475 

0.502 

0.538 

0.593 

0.577 

0.611 

0.658 

0.728 

0.734 

0.886 


1.519 

0. 

.564 

1.678 

0. 

.756 

1.842 

0. 

.959 

.03 

1 . 

.20 

.20 

1 . 

.37 
TABLE 5.8d. Upper Tail Percentage Points for the Statistic T1(c) x 1000, p = 10

PERCENT 


-0.0025 


-0.003 


-0.0035 


-0.004 


n 

75 

80 

85 

90 

95 

97.5 

24 

0 

109 

0 

114 

0 

121 

0 

129 

0 

143 

0 

156 

30 

0 

131 

0 

139 

0 

149 

0 

164 

0 

187 

0 

208 

40 

0 

160 

0 

169 

0 

182 

0 

201 

0 

232 

0 

264 

60 

0 

193 

0 

203 

0 

220 

0 

243 

0 

284 

0 

330 

120 

0 

231 

0 

246 

0 

264 

0 

288 

0 

331 

0 

380 

24 

0 

156 

0 

164 

0 

174 

0 

185 

0 

205 

0 

223 

30 

0 

189 

0 

199 

0 

213 

0 

235 

0 

268 

0 

298 

40 

0 

229 

0 

243 

0 

261 

0 

288 

0 

333 

0 

379 

60 

0 

277 

0 

292 

0 

316 

0 

348 

0 

409 

0 

474 

120 

0 

333 

0 

354 

0 

380 

0 

415 

0 

477 

c 

547 

24 

0 

212 

0 

222 

0 

236 

0 

251 

0 

278 

0 

303 

30 

0 

256 

0 

270 

0 

289 

0 

318 

0 

363 

0 

404 

40 

0 

312 

0 

330 

0 

354 

0 

391 

0 

452 

0 

514 

60 

0 

376 

0 

397 

0 

430 

0 

473 

0 

556 

0 

644 

120 

0 

453 

0 

481 

0 

516 

0 

564 

0 

648 

0 

744 

24 

0 

276 

0 

290 

0 

307 

0 

327 

0 

362 

0 

394 

30 

0 

334 

0 

352 

0 

377 

0 

414 

0 

472 

0 

525 

40 

0 

406 

0 

429 

0 

462 

0 

509 

0 

588 

0 

669 

60 

0 

491 

0 

517 

0 

560 

0 

616 

0 

724 

0 

838 

1  20 

0 

591 

0 

628 

0 

674 

0 

736 

0 

846 

0 

970 

0.453 

0.557 

0.643 


TABLE 5.8e. Upper Tail Percentage Points for the Statistic T1(c), p = 10

                                  PERCENT
      n       75       80       85       90       95     97.5       99
     24     6.48     7.90     10.1     14.7     26.1     42.9     66.1
     30     6.07     7.07     8.53     11.3     17.2     24.5     36.2
     40     5.35     6.01     6.93     8.46     11.6     15.2     21.2
     60     4.72     5.14     5.72     6.55     8.01     9.93     12.5
    120     4.20     4.45     4.79     5.24     6.08     6.85     8.04

     24     2.10     2.38     2.80     3.51     5.08     7.76     13.7
     30     2.34     2.61     3.03     3.78     5.31     7.51     12.1
     40     2.40     2.65     3.02     3.62     4.80     6.35     8.94
     60     2.38     2.58     2.86     3.26     3.97     4.98     6.20
    120     2.30     2.45     2.62     2.88     3.34     3.80     4.52

     24    0.367    0.377    0.391    0.413    0.445    0.476    0.509
     30    0.438    0.456    0.477    0.507    0.559    0.605    0.662
     40    0.534    0.559    0.589    0.635    0.707    0.788    0.883
     60    0.668    0.700    0.741    0.789    0.905     1.03     1.20
    120    0.861    0.906    0.959     1.06     1.20     1.33     1.59

A 

*T 

1  1 Q 

1  .  sJ 

1  . 35 

1  .3 

0 

1.56 

1.59 

1.6 

0 

1  .89 

1  .95 

2.0 

0 

2.39 

2.48 

2.5 

0 

3.23 

3.38 

3.5 

1 

.55 

1 

1  . 

.89 

1 

2 

.38 

2 

3 

.17 

3 

5.3  A  Power  Study 

A power study was performed to determine how well $T_1(c)$ could
detect non-normality. The following definitions of non-Gaussian
multivariate distributions were used in the power study. Suppose
$(x_1, x_2, \ldots, x_p)^T \sim N(0, R)$, where $R$ is a positive-definite
correlation matrix.

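(1)  If $y_i = \sum_{j=1}^{r} x_{ij}^2$, $i = 1, 2, \ldots, p$, where $x_{ij}$
is the $i$th component of the $j$th of $r$ independent $N(0, R)$ vectors, then
$(y_1, y_2, \ldots, y_p)^T$ follows a p-variate chi-square distribution with
$r$ degrees of freedom.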

(2)  If  w  has  a  chi-square  distribution  with  r  degrees  of  freedom, 
w  is  independent  of  the  x.'s,  and  t.  =  x 4J w/ r ,  then 

( t ^ ,  t?,  ....  tp)T  follows  a  p-variate,  t-distribution 
on  r  degrees  of  freedom. 

(3)  If  yi  =  expLx.]  i  =  1,  2,  ....  p  then  (y] ,  y2 . yp)T 

follows  a  p-variate  lognormal  distribution. 

(4)  If,  in  a  sample  of  n  p-variate  vectors,  X  =  n^/n  is  the 

fraction  from  Np(m^,0^)  and  (1  -  x)  is  the  fraction  from 
N  (m_,0_),  then  (x1f  x„,  ....  x  )  is  from  the  normal 

p  2  2  12  p 

mixture  XN  (m1fD.)  +  (1  -  X)N  (m„,D„). 

p  I  I  p  2  2 
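The four alternatives defined above can be simulated directly from their definitions. The following is a minimal sketch, assuming NumPy is available; the function names (`sample_chi_square`, `sample_t`, `sample_lognormal`, `sample_mixture`) are hypothetical and are not taken from the thesis. The mixture sampler draws exactly $n_1 = \lambda n$ (rounded) vectors from the first component, following definition (4).

```python
import numpy as np

def _corr(p, R):
    """Return R as a float array, defaulting to the p-variate identity."""
    return np.eye(p) if R is None else np.asarray(R, dtype=float)

def sample_chi_square(rng, n, p, r, R=None):
    """n draws of a p-variate chi-square on r degrees of freedom:
    y_i = sum over j of x_ij^2, using r independent N(0, R) vectors per draw."""
    x = rng.multivariate_normal(np.zeros(p), _corr(p, R), size=(n, r))  # (n, r, p)
    return (x ** 2).sum(axis=1)

def sample_t(rng, n, p, r, R=None):
    """n draws of a p-variate t on r degrees of freedom: t_i = x_i / sqrt(w / r)."""
    x = rng.multivariate_normal(np.zeros(p), _corr(p, R), size=n)
    w = rng.chisquare(r, size=n)  # independent of x
    return x / np.sqrt(w / r)[:, None]

def sample_lognormal(rng, n, p, R=None):
    """n draws of a p-variate lognormal: y_i = exp(x_i)."""
    return np.exp(rng.multivariate_normal(np.zeros(p), _corr(p, R), size=n))

def sample_mixture(rng, n, m1, D1, m2, D2, lam):
    """n draws with a fixed fraction lam from N(m1, D1) and the rest from N(m2, D2)."""
    n1 = int(round(lam * n))
    first = rng.multivariate_normal(m1, D1, size=n1)
    second = rng.multivariate_normal(m2, D2, size=n - n1)
    return np.vstack([first, second])
```

For example, `sample_mixture(np.random.default_rng(0), 20, [0.0], [[1.0]], [2.0], [[1.0]], 0.5)` draws a univariate sample of size 20 with ten observations from each component.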


For the analysis, the matrix $R$ was the p-variate identity for p = 1, 2, or 5. Chi-square distributions with 2, 4, 6, 10, and 14 degrees of freedom were examined; t-distributions with 3, 5, 7, and 9 degrees of freedom were examined. Four Gaussian mixture distributions were analyzed. For p = 1, the parameters defining the mixture distributions were (1) $m_1 = 0$, $m_2 = 2$, $D_1 = D_2 = 1$, and $\lambda = 0.75$; (2) $m_1 = 0$, $m_2 = 2$, $D_1 = D_2 = 1$, and $\lambda = 0.5$; (3) $m_1 = m_2$, $D_1 = 1$, $D_2 = 4$, and $\lambda = 0.5$; and (4) $m_1 = 0$, $m_2 = 1$, $D_1 = D_2 = 1$, and $\lambda = 0.75$. For p = 2, the parameters defining the mixture distributions were


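(1) $m_1 = [2,2]^T$, $m_2 = [0,0]^T$, $D_1$ [illegible], $D_2 = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix}$, and $\lambda = 0.25$;

(2) $m_1 = [2,2]^T$, $m_2 = [0,0]^T$, $D_1 = D_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, and $\lambda = 0.5$;

(3) $m_1$ [illegible], $m_2 = [0,0]^T$, $D_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, $D_2 = \begin{bmatrix} 1 & -0.75 \\ -0.75 & 1 \end{bmatrix}$, and $\lambda = 0.25$;

(4) $m_1 = [1,1]^T$, $m_2 = [0,0]^T$, $D_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, $D_2 = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix}$, and $\lambda = 0.25$.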

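For p = 5, the Gaussian mixture parameters were

(1) $m_1 = [4,4,4,4,4]^T$, $m_2 = [0,0,0,0,0]^T$, $D_1 = I_5$,

$$D_2 = \begin{bmatrix} 1.0 & & & & \\ 0.5 & 1.0 & & & \\ 0.25 & 0.5 & 1.0 & & \\ 0.125 & 0.25 & 0.5 & 1.0 & \\ 0.0625 & 0.125 & 0.25 & 0.5 & 1.0 \end{bmatrix}$$

(symmetric; lower triangle shown), and $\lambda = 0.25$;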

(2)  _  _  r, 


,3,3,3, 

3]T 

» 

m2  = 

[0,1 

'l  =  °2 

=  I 

5  ’ 

and 

X  =  1 

,  =  [0,0 

,0, 

0,0] 

T,o, 

=  I 

=  0.5; 

,2,2,2, 

2]T 

» 

m2  * 

[0,1 

1  .0 

0.5 

1  . 

0 

0.25 

0. 

5 

1  .0 

0.125 

0. 

25 

0.5 

1.0 

0.0625 

0. 

625 

0.25 

0.5 

1  5 


,  and  X.  =  0.25. 


These mixture distributions are denoted by MX1, MX2, MX3, and MX4. Based on 2000 trials, the above distributions were examined for sample sizes 20, 50, and 100. Tables 5.9 to 5.16 (a-d) present the power results for $\alpha$ = 0.1, 0.05, 0.025, and 0.01.
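A simulation of this kind can be organized as in the sketch below, which is only an outline under stated assumptions. The statistic used here, the absolute sample excess kurtosis, is a stand-in chosen so the example is self-contained; it is not $T_1(c)$. The function names, the upper-tail rejection rule, and the simulated critical value are likewise assumptions made for illustration.

```python
import numpy as np

def simulate_critical_value(null_sampler, statistic, alpha, trials=2000, seed=0):
    """Upper-alpha critical value of `statistic` under the null, by simulation."""
    rng = np.random.default_rng(seed)
    values = [statistic(null_sampler(rng)) for _ in range(trials)]
    return float(np.quantile(values, 1.0 - alpha))

def estimate_power(sampler, statistic, critical_value, trials=2000, seed=1):
    """Fraction of simulated samples whose statistic exceeds critical_value."""
    rng = np.random.default_rng(seed)
    rejections = sum(statistic(sampler(rng)) > critical_value for _ in range(trials))
    return rejections / trials

def abs_excess_kurtosis(sample):
    """Stand-in statistic (NOT T_1(c)): |sample kurtosis - 3| for univariate data."""
    z = (sample - sample.mean()) / sample.std()
    return abs(float((z ** 4).mean()) - 3.0)

if __name__ == "__main__":
    n = 50
    normal = lambda rng: rng.normal(size=n)      # null hypothesis: Gaussian data
    t3 = lambda rng: rng.standard_t(3, size=n)   # alternative: t on 3 d.f.
    cv = simulate_critical_value(normal, abs_excess_kurtosis, alpha=0.05)
    print(estimate_power(t3, abs_excess_kurtosis, cv))
```

Replacing `abs_excess_kurtosis` with an implementation of the test statistic, and `t3` with each alternative in turn, would give one entry of a power table per call.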




Comparing Tables 5.9 to 5.16 (a and b) with the results of a power study carried out by Paulson, Roohan, Hwang and Fuller (1987), it can be seen that the proposed test has excellent power against heavy-tailed symmetric alternatives and good power against nonsymmetric alternatives. In general, the power of $T_1(c)$ increases as c increases, since the procedure becomes more critical of observations in the tails of the distribution. The exception is the second mixture distribution, MX2, where the power of $T_1(c)$ increases as c decreases. For MX2, an equal number of observations are drawn from Gaussian distributions with equal covariance matrices but with unequal means. The observations from MX2 that are near the mean $0.5(m_1 + m_2)$ can be considered as inliers of the distribution. Since the model-critical procedure becomes increasingly critical of inliers as c < 0 decreases, the power of $T_1(c)$ increases as c decreases. Finally, it will be noted that for |c| > 0.1 the power does not, in general, increase dramatically. The implication is that, for hypothesis testing, small values of |c| are adequate. In the next section, it will be shown that $T_1(c)$ is a good test of normality for the residuals from a linear model when |c| < 0.1. In this way, $T_1(c)$ provides a joint test of fit for the joint character of the errors and the assumed parametric model.
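The behaviour of MX2 can be illustrated with a short moment calculation, offered here only as a supporting sketch. Take the univariate MX2, with $m_1 = 0$, $m_2 = 2$, $D_1 = D_2 = 1$ and $\lambda = 0.5$, and write an observation as $x = 1 + s + z$, where $s = \pm 1$ with equal probability and $z \sim N(0,1)$ independently of $s$ (the symbols $s$ and $z$ are introduced for this illustration only). Since the odd moments of $z$ vanish,

$$\mathrm{Var}(x) = E[s^2] + E[z^2] = 1 + 1 = 2,$$

$$E[(x-1)^4] = E[z^4] + 6\,E[s^2]\,E[z^2] + E[s^4] = 3 + 6 + 1 = 10,$$

$$\frac{E[(x-1)^4]}{\mathrm{Var}(x)^2} = \frac{10}{4} = 2.5 < 3.$$

The kurtosis of this mixture is therefore below the Gaussian value of 3: its tails are lighter than those of a normal distribution with the same mean and variance. A positive c, which concentrates the criticism on tail observations, then finds little to object to, which is consistent with the low power recorded for MX2 when c > 0.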


>M 


TABLE  5.9b 

Power  of  the  Test  for  Normality  T -j  ( c )  at 
Significance  Level  a  =  0.05,  n  =  20  and  p  ; 


Alternative  0.3 


Lognormal 


(ill 


.63 

0.60 

.13 

0.12 

.15 

0.14 

.23 

0.21 

.38 

0.36 

.12 

0.11 

.15 

0.13 

.19 

0.18 

.25 

0.24 

.40 

0.38 

‘5. 

i 

s 

s 
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TABLE  5.9c 


Power of the Test for Normality $T_1(c)$ at
Significance Level $\alpha$ = 0.025, n = 20 and p = 1

                                          c
Alternative     0.3     0.2     0.1     0.05    0.01    -0.01   -0.05   -0.1
Lognormal       0.69    0.66    0.63    0.60    0.58    0.58    0.56    0.53
T(9)            0.10    0.10    0.10    0.09    0.09    0.09    0.09    0.08
T(7)            0.13    0.13    0.13    0.12    0.12    0.12    0.12    0.11
T(5)            0.19    0.19    0.19    0.18    0.18    0.18    0.18    0.17
T(3)            0.37    0.36    0.35    0.34    0.33    0.33    0.32    0.30
χ²(14)          0.09    0.09    0.09    0.08    0.08    0.08    0.08    0.07
χ²(10)          0.12    0.12    0.11    0.10    0.10    0.10    0.09    0.09
χ²(6)           0.17    0.17    0.17    0.15    0.15    0.14    0.13    0.16
χ²(4)           0.26    0.24    0.23    0.21    0.21    0.20    0.19    0.18
χ²(2)           0.44    0.42    0.38    0.36    0.34    0.34    0.33    0.31
MX1             0.02    0.02    0.02    0.02    0.02    0.02    0.02    0.02
MX2             0.01    0.01    0.01    0.01    0.01    0.01    0.01    0.02
MX3             0.12    0.12    0.12    0.10    0.10    0.10    0.10    0.09
MX4             0.02    0.02    0.02    0.03    0.03    0.03    0.03    0.02
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TABLE  5.10a 

Power of the Test for Normality $T_1(c)$ at
Significance Level $\alpha$ = 0.1, n = 50 and p = 1

                                          c
Alternative     0.3     0.2     0.1     0.05    0.01    -0.01   -0.05   -0.1
Lognormal       0.99    0.98    0.97    0.96    0.95    0.95    0.93    0.92
T(9)            0.30    0.30    0.29    0.28    0.27    0.26    0.25    0.24
T(7)            0.30    0.30    0.29    0.36    0.35    0.35    0.34    0.31
T(5)            0.55    0.54    0.52    0.51    0.50    0.50    0.49    0.47
T(3)            0.79    0.78    0.76    0.75    0.74    0.74    0.73    0.71
χ²(14)          0.31    0.30    0.29    0.28    0.27    0.27    0.26    0.25
χ²(10)          0.37    0.36    0.35    0.34    0.32    0.32    0.31    0.29
χ²(6)           0.50    0.48    0.45    0.44    0.42    0.41    0.40    0.38
χ²(4)           0.64    0.61    0.57    0.55    0.53    0.53    0.50    0.47
χ²(2)           0.85    0.83    0.79    0.76    0.74    0.73    0.71    0.68
MX1             0.07    0.07    0.08    0.07    0.07    0.08    0.08    0.08
MX2             0.14    0.15    0.18    0.18    0.20    0.21    0.22    0.24
MX3             0.44    0.41    0.38    0.36    0.34    0.34    0.32    0.28
MX4             0.10    0.10    0.10    0.10    0.10    0.10    0.11    0.11
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TABLE  5.10b 

Power of the Test for Normality $T_1(c)$ at
Significance Level $\alpha$ = 0.05, n = 50 and p = 1

                                          c
Alternative     0.3     0.2     0.1     0.05    0.01    -0.01   -0.05   -0.1
Lognormal       0.98    0.97    0.95    0.95    0.94    0.93    0.92    0.91
T(9)            0.23    0.23    0.23    0.22    0.22    0.21    0.21    0.20
T(7)            0.32    0.33    0.31    0.30    0.30    0.29    0.28    0.27
T(5)            0.18    0.48    0.47    0.46    0.45    0.45    0.43    0.42
T(3)            0.74    0.73    0.72    0.72    0.71    0.70    0.69    0.67
χ²(14)          0.22    0.22    0.21    0.21    0.20    0.20    0.20    0.19
χ²(10)          0.29    0.30    0.28    0.27    0.26    0.25    0.25    0.24
χ²(6)           0.43    0.42    0.39    0.38    0.36    0.36    0.34    0.32
χ²(4)           0.56    0.54    0.52    0.49    0.47    0.47    0.45    0.43
χ²(2)           0.82    0.78    0.74    0.72    0.70    0.69    0.67    0.64
MX1             0.03    0.03    0.03    0.03    0.03    0.03    0.03    0.03
MX2             0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.11
MX3             0.35    0.33    0.31    0.29    0.28    0.27    0.25    0.24
MX4             0.04    0.04    0.04    0.04    0.04    0.04    0.04    0.05

TABLE 5.10d

Power of the Test for Normality $T_1(c)$ at
Significance Level $\alpha$ = 0.01, n = 50 and p = 1


c 


Alternative 

■  0 

.3 

0 

.2 

0 

.1 

0 

.05 

0 

01 

-0.01 

m 

Lognorma 1 

0. 

.95 

0 

.93 

0. 

.90 

0. 

.87 

0. 

86 

0, 

.86 

0, 

.84 

0. 

.83 

T<9) 

0 

.10 

0 

.11 

0 

.11 

0 

.10 

0. 

10 

0 

.10 

0 

.10 

0 

.10 

T  ( 7 ) 

0 

.17 

0 

.17 

0 

.17 

0 

.17 

0 

17 

0 

.17 

0 

.17 

0 

.17 

T(  5) 

0 

.32 

0 

.31 

0 

.30 

0. 

.30 

0. 

29 

0 

.29 

0. 

.29 

0. 

.29 

T(3) 

X2(14) 

0. 

.60 

0 

.59 

0, 

.56 

0, 

,  55 

0. 

54 

0. 

.54 

0. 

.53 

0. 

.52 

0. 

.10 

0 

.10 

0 

.10 

0. 

.09 

0 

09 

0 

.09 

0 

.09 

0. 

x;(10) 

0. 

.15 

0 

.16 

0. 

.15 

0. 

.14 

0. 

13 

0. 

.13 

0, 

.13 

0. 

.13 

X^(6) 

0. 

.25 

0. 

.25 

0. 

.23 

0. 

.22 

0. 

21 

0. 

.21 

0. 

,20 

0. 

.19 

X^(4) 

0. 

40 

0, 

,38 

0. 

,34 

0. 

32 

0. 

31 

0. 

.30 

0. 

,29 

0. 

X2(2) 

0 

69 

0. 

.65 

0. 

60 

0. 

56 

0. 

54 

0. 

53 

0. 

52 

0. 

MX1 

0. 

01 

0. 

01 

0. 

01 

0. 

01 

0. 

01 

0. 

01 

0. 

0. 

MX2 

0. 

00 

0. 

.00 

0. 

00 

0. 

00 

0. 

00 

0. 

00 

0. 

0. 

MX3 

0. 

16 

0. 

14 

0. 

13 

0. 

12 

0. 

1 1 

0. 

1 1 

0. 

1 1 

0. 

1 1 

MX4 

0. 

01 

0. 

,01 

0. 

01 

0. 

01 

0. 

01 

0. 

01 

0. 

0. 
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TABLE  5.11a 

Power  of  the  Test  for  Normality  T-|(c)  at 
Significance  Level  a  =  0.1,  n  =  100  and  p  = 


Alternative  0.3 


-0.05 


Lognorma 1 

1  .00 

1  .00 

1  .00 

1  .0 

1(9) 

0.42 

0.43 

0.42 

0.4 

1(7) 

0.57 

0.57 

0.55 

0.5 

T(  5) 

0.77 

0.76 

0.74 

0.7 

T(3) 

0.95 

0.95 

0.94 

0.9 

X‘(14) 

0.39 

0.40 

0.38 

0.3 

X?(10) 

0.50 

0.50 

0.49 

0.4 

X2(6) 

0.68 

0.65 

0.62 

0.6 

X‘(4) 

0.83 

0.82 

0.79 

0.7 

X2(2) 

0.98 

0.97 

0.95 

0.9 

MX  1 

0.08 

0.08 

0.08 

0.0 

0 

1 

.00 

0. 

99 

0 

0 

0 

.40 

0. 

39 

0 

2 

0 

.52 

0. 

52 

0 

3 

0 

.73 

0. 

72 

0 

3 

0 

.93 

0. 

93 

0 

I 

y; 

.1 
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TABLE  5.11b 

Power of the Test for Normality $T_1(c)$ at
Significance Level $\alpha$ = 0.05, n = 100 and p = 1

                                          c
Alternative     0.3     0.2     0.1     0.05    0.01    -0.01   -0.05   -0.1
Lognormal       1.00    1.00    1.00    1.00    0.99    0.99    0.99    0.99
T(9)            0.35    0.35    0.36    0.35    0.35    0.34    0.33    0.33
T(7)            0.49    0.49    0.48    0.48    0.47    0.47    0.45    0.44
T(5)            0.71    0.70    0.70    0.69    0.68    0.68    0.67    0.66
T(3)            0.94    0.93    0.93    0.93    0.92    0.92    0.91    0.90
χ²(14)          0.31    0.32    0.32    0.31    0.30    0.30    0.29    0.27
χ²(10)          0.41    0.43    0.43    0.42    0.41    0.40    0.39    0.37
χ²(6)           0.60    0.59    0.57    0.56    0.55    0.54    0.52    0.50
χ²(4)           0.79    0.78    0.75    0.73    0.71    0.71    0.69    0.66
χ²(2)           0.97    0.96    0.94    0.93    0.92    0.91    0.90    0.88
MX1             0.03    0.03    0.03    0.03    0.03    0.03    0.02    0.03
MX2             0.10    0.10    0.12    0.12    0.13    0.14    0.14    0.17
MX3             0.55    0.53    0.49    0.47    0.46    0.45    0.42    0.40
MX4             0.03    0.03    0.03    0.05    0.04    0.04    0.04    0.04
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TABLE  5.11c 

Power  of  the  Test  for  Normality  T-)(c)  at 
Significance  Level  a  =  0.025,  n  =  100  and  p  =  1 


Alternative  0.3 


Lognorma 1 
(9) 

(7) 

(5) 

pi4) 
2(10) 


0. 

.3 

0. 

.2 

0, 

.1 

0, 

.05 

0. 

,01 

-0.01 

-0.05 

-0.1 

1 . 

.00 

1  . 

.00 

1  . 

.00 

1  , 

,00 

i  . 

,00 

1  .00 

1.00 

1  .00 

0 

.28 

0, 

.28 

0 

.29 

0, 

.28 

0. 

.28 

0.27 

0.27 

0.27 

0 

.41 

0, 

.42 

0 

.41 

0, 

.40 

0, 

.40 

0.39 

0.39 

0.38 

0 

.64 

0, 

.65 

0 

.64 

0. 

.63 

0. 

.62 

0.62 

0.61 

0.61 

0. 

.92 

0. 

.91 

0, 

.91 

0, 

.90 

0. 

.89 

0.89 

0.88 

0.88 

0. 

.25 

0. 

.25 

0. 

.25 

0, 

,24 

0. 

.23 

0.23 

0.23 

0.22 

0 

.35 

0, 

.36 

0. 

.35 

0, 

.35 

0. 

.33 

0.33 

0.32 

0.31 

0 

.54 

0. 

.53 

0 

.52 

0, 

.50 

0. 

.48 

0.47 

0.46 

0.44 

0, 

.75 

0. 

.72 

0. 

.69 

0. 

.67 

0. 

.65 

0.64 

0.62 

0.60 

0 

.96 

0. 

.94 

0, 

.92 

0. 

.91 

0. 

.89 

0.89 

0.87 

0.86 

0. 

.02 

0. 

.01 

0. 

.01 

0. 

.01 

0. 

.01 

0.01 

0.01 

0.01 

0. 

.01 

0. 

.01 

0 

.01 

0. 

.01 

0. 

.01 

0.01 

0.02 

0.04 

0. 

.  4o 

.42 

0 

.39 

.37 

0. 

.36 

0.35 

0.33 

m 

.Cl 

0, 

EEs 

El 

TO 
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TABLE  5. lid 

Power of the Test for Normality $T_1(c)$ at
Significance Level $\alpha$ = 0.01, n = 100 and p = 1

                                          c
Alternative     0.3     0.2     0.1     0.05    0.01    -0.01   -0.05   -0.1
Lognormal       1.00    1.00    1.00    0.99    0.99    0.99    0.98    0.98
T(9)            0.21    0.22    0.21    0.20    0.20    0.20    0.19    0.19
T(7)            0.31    0.32    0.32    0.30    0.30    0.30    0.30    0.28
T(5)            0.57    0.56    0.54    0.53    0.52    0.51    0.51    0.50
T(3)            0.88    0.87    0.86    0.84    0.84    0.83    0.83    0.81
χ²(14)          0.17    0.18    0.18    0.16    0.16    0.16    0.15    0.15
χ²(10)          0.28    0.28    0.27    0.25    0.25    0.24    0.24    0.23
χ²(6)           0.46    0.45    0.43    0.40    0.39    0.38    0.37    0.35
χ²(4)           0.67    0.64    0.61    0.58    0.56    0.54    0.53    0.50
χ²(2)           0.94    0.92    0.89    0.86    0.84    0.83    0.81    0.79
MX1             0.01    0.00    0.00    0.00    0.00    0.00    0.00    0.00
MX2             0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
MX3             0.33    0.30    0.28    0.25    0.23    0.23    0.22    0.21
MX4             0.01    0.01    0.01    0.01    0.01    0.01    0.01    0.00
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TABLE  5.12c 


Power  of  the  Test  for  Normality  T-|(c)  at 
Significance  Level  a  =  0.025,  n  =  20  and  p 


Alternative  0.3 


0.025  0.006 


Lognormal 
(9) 

(7) 

(5) 

13) 
‘(14) 
*(10) 
(6) 


4 

0 

2 

0 

7 

0 

8 

0 

0 

0 

0 

0 

2 

0 

!0 

0.74 

0.71 

3 

0.12 

0.1 

8 

0.17 

O.li 

!8 

0.26 

0.2 

^8 

0.47 

0.4 

1 

0.10 

0.0' 

3 

0.12 

0.1 

'0 

0.19 

0.1 

!9 

0.27 

0.2' 

>0 

0.44 

0.41 

)4 

0.04 

0.0' 

)2 

0.01 

0.0 

3 

0.11 

)4 

0.0 

s  V.V.  «»  V  VV  v  v?/V 


$ 

3 
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TABLE  5.1  2d 

Power  of  the  Test  for  Normality  T -j  ( c )  at 
Significance  Level  a  =  0.01,  n  =  20  and  p  = 


TABLE  5.13a 

Power  of  the  Test  for  Normality  T] ( c )  at 
Significance  Level  a  =  0.1,  n  =  50  and  p  = 


Alternative  0.3 


0.025  0.006 


Lognorma 1 

1.00 

1.00 

0.99 

0.99 

0 

T  | 

[9) 

0.44 

0.44 

0.44 

0.43 

0 

Ti 

[7) 

0.58 

0.58 

0.57 

0.56 

0 

Tl 

[5) 

0.75 

0.75 

0.74 

0.73 

0 

Ti 

S3) 

0.94 

0.94 

0.93 

0.92 

0 

X: 

(14) 

0.35 

0.36 

0.36 

0.36 

0 

X: 

(10) 

0.45 

0.46 

0.44 

0.43 

0 

X‘ 

(6) 

0.62 

0.62 

0.59 

0.57 

0 

X: 

(4) 

0.75 

0.74 

0.71 

0.69 

0 

X‘ 

-(2) 

0.95 

0.93 

0.91 

0.90 

0 

m; 

<1 

0.16 

0.16 

0.15 

0.15 

0 

M) 

<2 

0.06 

0.06 

0.09 

0.17 

0 

M) 

<3 

0.58 

0.47 

0.43 

M) 

<4 

0.12 

0.12 

9 

0. 

.99 

0 

3 

0. 

,42 

0 

5 

0. 

.55 

0 

2 

0. 

.72 

0 

2 

0. 

.92 

0 

5 

0. 

,35 

0 

3 

0. 

.42 

0 

6 

0, 

.56 

0 

8 

0. 

.68 

0 

9 

0. 

.89 

0 

5 

0. 

.15 

0 

9 

0. 

.20 

0 

1 

0. 

40 

0 
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TABLE  5.13b 

Power of the Test for Normality $T_1(c)$ at
Significance Level $\alpha$ = 0.05, n = 50 and p = 2

                                          c
Alternative     0.3     0.2     0.1     0.025   0.006   -0.006  -0.025  -0.1
Lognormal       1.00    0.99    0.99    0.98    0.98    0.98    0.98    0.97
T(9)            0.34    0.34    0.34    0.34    0.34    0.34    0.33    0.31
T(7)            0.47    0.48    0.48    0.46    0.46    0.46    0.45    0.42
T(5)            0.67    0.66    0.66    0.65    0.65    0.65    0.64    0.62
T(3)            0.92    0.91    0.90    0.90    0.90    0.89    0.89    0.88
χ²(14)          0.25    0.26    0.26    0.26    0.26    0.26    0.26    0.24
χ²(10)          0.35    0.35    0.34    0.33    0.33    0.32    0.32    0.30
χ²(6)           0.52    0.51    0.49    0.48    0.47    0.47    0.46    0.43
χ²(4)           0.68    0.66    0.64    0.62    0.61    0.60    0.60    0.56
χ²(2)           0.92    0.90    0.87    0.84    0.84    0.83    0.83    0.79
MX1             0.10    0.08    0.08    0.09    0.09    0.09    0.09    0.08
MX2             0.02    0.03    0.03    0.03    0.03    0.04    0.04    0.07
MX3             0.47    0.41    0.33    0.30    0.29    0.29    0.28    0.24
MX4             0.07    0.07    0.07    0.07    0.07    0.07    0.07    0.07
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TABLE  5.13d 


Power  of  the  Test  for  Normality  T-|(c)  at 
Significance  Level  a  =  0.01,  n  =  50  and  p  =  2 


Alternative 

0.3 

0.2 

0.1 

0.025 

0.006 

-0.006 

-0. 

025 

-0. 

1 

Lognormal 

0.99 

0.98 

0.97 

0.96 

0.95 

0.95 

0. 

95 

0.93 

T  ( 9 ) 

0.17 

0.19 

0.19 

0.19 

0.19 

0.19 

0. 

1 

8 

0.1 

8 

T(  7 ) 

0.30 

0.30 

0.30 

0.29 

0.29 

0.29 

0. 

2 

9 

0.2 

8 

T(5) 

U.4a 

0.48 

0.47 

0.46 

0.46 

0.45 

0. 

4 

5 

0.4 

4 

T(  3) 

0.83 

0.82 

0.80 

0.79 

0.78 

0.78 

0. 

7 

8 

0.7 

7 

X|(14) 

0.12 

0.13 

0.13 

0.12 

0.12 

0.12 

0. 

1 

2 

0.1 

2 

Xf(10) 

0.18 

0.19 

0.18 

0.18 

0.18 

0.18 

0. 

1 

8 

0.1 

7 

X|<6) 

0.33 

0.33 

0.32 

0.31 

0.30 

0.30 

0. 

3 

0.2 

8 

X|(4) 

0.50 

0.49 

0.47 

0.44 

0.43 

0.43 

0. 

4 

3 

0.4 

a 

X2(2) 

0.84 

0.80 

0.75 

0.71 

0.70 

0.69 

0. 

6 

8 

0.6 

4 

MX! 

0.03 

0.03 

0.03 

0.03 

0.03 

0.03 

0. 

s 

0 

MX2 

0.00 

0.00 

0.00 

0.00 

0.00 

0.00 

0. 

S 

0.0 

1 

MX3 

0.27 

0.20 

0.15 

0.12 

0.12 

0.11 

0. 

i 

1 

0.0 

9 

MX4 

0.02 

0.02 

0.02 

0.02 

0.02 

0.02 

0. 

E 

0.0 

2 

r>  r^wv^vvvwYVT, 


TABLE  5.14a 


Power  of  the  Test  for  Normality  T-j(c)  at 
Significance  Level  a  =  0.1,  n  =  100  and  p  =  2 


& 


Alternative 

0.3 

0.2 

0.1 

0.025 

0.006 

-0.006 

-0.025 

-0. 

1 

lognorma 1 

1  .00 

1  .00 

1  .00 

1  .00 

1  .00 

1  .00 

1  .00 

1  . 

00 

T  ( 9 ) 

0.68 

0.67 

0.66 

0.65 

0.64 

0  64 

0.64 

0. 

60 

T  ( 7 ) 

0.81 

0.79 

0.78 

0.78 

0.77 

0.77 

0.77 

0. 

74 

T(5) 

0.94 

0.94 

0.93 

0.92 

0.92 

0.92 

0.91 

0. 

9 

T(3) 

1  .00 

1  .00 

1  .00 

1  .00 

1  .00 

1  .00 

1  .00 

1 . 

E 

X?(14) 

0.48 

0.48 

0.48 

0.47 

0.47 

0.46 

0.45 

0. 

4 

2 

X2( 10) 

0.61 

0.60 

0.59 

0.58 

0.57 

0.57 

0.57 

0. 

5 

3 

X2(6) 

0.83 

0.82 

0.80 

0.78 

0.77 

0.77 

0.76 

0. 

7 

2 

X^(4) 

0.93 

0.93 

0.91 

0.89 

0.88 

0.88 

0.87 

0. 

8 

5 

X2(2) 

m 

1  .00 

0.99 

0.99 

0.99 

0.99 

0.98 

0. 

9 

8 

M 

XI 

0.2 

3 

0.22 

0.22 

0.22 

0.22 

0.22 

0.22 

0. 

2 

0 

M 

X2 

0.3 

m 

0.30 

0.33 

0.40 

0.41 

0.42 

0.43 

0. 

4 

8 

M 

X3 

3 

0.77 

0.68 

0.61 

0.59 

0.57 

0.55 

0. 

4 

7 

M 

X4 

0.1 

8 

0.17 

0.17 

0.18 

0.18 

0.18 

0.18 

0. 

1 

8 

H 
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TABLE  5.14b 

Power of the Test for Normality $T_1(c)$ at
Significance Level $\alpha$ = 0.05, n = 100 and p = 2

                                          c
Alternative     0.3     0.2     0.1     0.025   0.006   -0.006  -0.025  -0.1
Lognormal       1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
T(9)            0.58    0.58    0.59    0.55    0.55    0.55    0.54    0.59
T(7)            0.74    0.73    0.73    0.70    0.69    0.69    0.69    0.73
T(5)            0.90    0.90    0.90    0.88    0.88    0.88    0.88    0.90
T(3)            1.00    1.00    1.00    0.99    0.99    0.99    0.99    1.00
χ²(14)          0.38    0.39    0.40    0.37    0.36    0.36    0.36    0.40
χ²(10)          0.51    0.51    0.52    0.49    0.48    0.47    0.47    0.52
χ²(6)           0.75    0.75    0.75    0.71    0.70    0.70    0.69    0.75
χ²(4)           0.90    0.89    0.88    0.85    0.84    0.84    0.83    0.88
χ²(2)           1.00    0.99    0.99    0.98    0.98    0.98    0.98    0.99
MX1             0.14    0.15    0.16    0.14    0.14    0.14    0.14    0.18
MX2             0.06    0.06    0.10    0.09    0.09    0.10    0.10    0.15
MX3             0.76    0.67    0.59    0.48    0.45    0.45    0.42    0.34
MX4             0.10    0.11    0.13    0.12    0.12    0.12    0.12    0.11

Lognormal       1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
T(9)            0.47    0.49    0.47    0.45    0.44    0.44    0.44    0.42
T(7)            0.65    0.66    0.65    0.62    0.61    0.60    0.60    0.58
T(5)            0.87    0.87    0.85    0.83    0.82    0.82    0.82    0.80
T(3)            1.00    1.00    0.99    0.99    0.99    0.99    0.99    0.98
χ²(14)          0.29    0.31    0.31    0.29    0.29    0.28    0.28    0.27
χ²(10)          0.42    0.44    0.43    0.40    0.40    0.39    0.39    0.36
χ²(6)           0.68    0.69    0.66    0.63    0.62    0.61    0.61    0.56
χ²(4)           0.86    0.86    0.83    0.79    0.79    0.78    0.77    0.73
χ²(2)           0.99    0.99    0.98    0.97    0.97    0.96    0.96    0.94
MX1             0.09    0.10    0.10    0.09    0.09    0.09    0.09    0.08
MX2             0.02    0.02    0.01    0.01    0.01    0.01    0.01    0.02
MX3             0.65    0.58    0.45    0.34    0.32    0.30    0.30    0.24
MX4             0.06    0.07    0.08    0.07    0.07    0.07    0.07    0.07

TABLE 5.14d

Power of the Test for Normality $T_1(c)$ at
Significance Level $\alpha$ = 0.01, n = 100 and p = 2

                                          c
Alternative     0.3     0.2     0.1     0.025   0.006   -0.006  -0.025  -0.1
Lognormal       1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
T(9)            0.38    0.37    0.36    0.33    0.33    0.33    0.32    0.32
T(7)            0.58    0.57    0.55    0.52    0.51    0.51    0.51    0.50
T(5)            0.82    0.80    0.79    0.75    0.75    0.75    0.74    0.72
T(3)            0.99    0.99    0.98    0.98    0.98    0.97    0.97    0.97
χ²(14)          0.22    0.22    0.22    0.20    0.19    0.19    0.19    0.18
χ²(10)          0.35    0.35    0.34    0.31    0.30    0.30    0.29    0.28
χ²(6)           0.61    0.59    0.56    0.51    0.50    0.50    0.48    0.46
χ²(4)           0.82    0.79    0.76    0.70    0.69    0.68    0.68    0.65
χ²(2)           0.99    0.99    0.98    0.94    0.94    0.93    0.93    0.94
MX1             0.06    0.06    0.05    0.05    0.04    0.04    0.04    0.04
MX2             0.00    0.01    0.01    0.00    0.00    0.00    0.00    0.01
MX3             0.57    0.44    0.31    0.20    0.20    0.18    0.17    0.14
MX4             0.04    0.04    0.04    0.03    0.03    0.03    0.03    0.04
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TABLE  5.15b 

Power of the Test for Normality $T_1(c)$ at
Significance Level $\alpha$ = 0.05, n = 50 and p = 5

                                          c
Alternative     0.3     0.2     0.1     0.004   0.002   -0.002  -0.004  -0.1
Lognormal       1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
T(9)            0.61    0.67    0.62    0.61    0.61    0.61    0.61    0.58
T(7)            0.77    0.81    0.78    0.75    0.75    0.75    0.75    0.71
T(5)            0.92    0.94    0.92    0.91    0.91    0.91    0.91    0.89
T(3)            1.00    1.00    1.00    0.99    0.99    0.99    0.99    0.98
χ²(14)          0.22    0.30    0.28    0.27    0.27    0.27    0.27    0.26
χ²(10)          0.32    0.41    0.37    0.36    0.36    0.36    0.36    0.34
χ²(6)           0.57    0.63    0.58    0.54    0.54    0.54    0.54    0.50
χ²(4)           0.79    0.81    0.75    0.71    0.71    0.71    0.71    0.66
χ²(2)           0.99    0.98    0.96    0.93    0.93    0.93    0.93    0.90
MX1             0.55    0.38    0.27    0.23    0.23    0.23    0.23    0.21
MX2             0.03    0.04    0.04    0.04    0.05    0.05    0.05    0.08
MX3             0.92    0.89    0.82    0.75    0.75    0.74    0.74    0.63
MX4             0.25    0.25    0.22    0.21    0.21    0.20    0.20    0.20
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TABLE  5.15c 

Power  of  the  Test  for  Normality  T-|(c)  at 
Significance  Level  a  =  0.025,  n  =  50  and  p  =  5 


c 


Alternative 

'  0. 

.3 

0. 

.2 

0. 

.1 

0. 

,004 

0. 

,002 

-C 

) .  002 

-0.004 

-0.1 

Lognorma 1 

1  . 

.00 

1  , 

.00 

1  . 

.00 

1  , 

.00 

1  . 

.00 

1  . 

.00 

1  . 

00 

1  . 

.00 

T(9) 

0. 

.49 

0. 

.57 

0. 

.54 

0. 

,52 

0. 

,52 

0. 

.52 

0. 

.52 

0. 

.47 

T(  7) 

0. 

.66 

0. 

.73 

0. 

.69 

0. 

.66 

0. 

66 

0. 

,66 

0. 

,66 

0. 

.62 

T(  5) 

0. 

.88 

0 

.90 

0, 

.89 

0, 

.87 

0. 

.87 

0. 

.87 

0. 

.87 

0. 

.83 

T(3) 

X2(14) 

0. 

.99 

0 

.99 

0. 

.99 

0, 

.99 

0. 

.99 

0. 

,99 

0. 

.99 

0. 

.98 

0. 

.14 

0, 

.20 

0, 

.20 

0, 

.19 

0. 

.19 

0. 

.19 

0. 

,19 

0. 

.18 

xj(10) 

0. 

.22 

0, 

.29 

0. 

.28 

0. 

.27 

0. 

.27 

0. 

.27 

0. 

,27 

0. 

.25 

Xj(6) 

0, 

.44 

0 

.52 

0. 

.47 

0 

.44 

0, 

.44 

0. 

.44 

0. 

.44 

0. 

.39 

X^(4) 

0 

.70 

0 

.72 

0. 

.67 

0 

.62 

0, 

.61 

0. 

.61 

0. 

.61 

0. 

.55 

X2(2) 

0, 

.97 

0 

.97 

0. 

.93 

0, 

.89 

0. 

.89 

0. 

.89 

0. 

.89 

0. 

MX! 

0. 

.48 

0. 

.27 

0. 

.19 

0. 

.16 

0. 

,16 

0. 

.16 

0. 

,16 

0. 

.13 

MX2 

0. 

,01 

0. 

,01 

0. 

.01 

0. 

.02 

0. 

.02 

0. 

.02 

0. 

.02 

0. 

ItXl 

MX3 

0. 

86 

0. 

.82 

0. 

,71 

0. 

61 

0. 

61 

0. 

.61 

0. 

61 

0. 

36 

MX4 

0. 

,17 

0. 

.18 

0. 

.15 

0. 

,14 

0. 

14 

0. 

14 

0. 

14 

0. 

12 
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Power  of  the  Test  for  Normality  T-|(c)  at 
Significance  Level  a  =  0.01,  n  =  50  and  p  =  5 


Alternative 

0 

3 

0 

2 

0 

1 

0 

004 

0 

002 

-( 

) .  002 

— 

-( 

).l 

Lognorma 1 

1 

00 

1 

00 

1 

00 

1 

00 

1 

00 

1 

00 

1 

.00 

0 

99 

T(  9 ) 

0 

36 

0 

43 

0 

40 

0 

41 

0 

41 

0 

40 

0 

.40 

0 

37 

T(  7) 

0 

53 

0 

60 

0 

57 

0 

56 

0 

56 

0 

56 

0 

.56 

0 

52 

T(5) 

0 

80 

0 

85 

0 

82 

0 

80 

0 

80 

0 

79 

0 

.79 

0 

76 

T(3) 

Xf( 14) 

0 

99 

0 

99 

0 

98 

0 

98 

0 

98 

0 

98 

0 

.98 

0 

96 

0 

07 

0 

12 

0 

12 

0 

13 

0 

13 

0 

13 

0 

.13 

0 

11 

X2(10) 

0 

13 

0 

19 

0 

19 

0 

19 

0 

19 

0 

19 

0 

.19 

0 

16 

X2(6) 

0 

31 

0 

38 

0 

34 

0 

33 

0 

33 

0 

33 

0 

.33 

0 

29 

x2(4) 

0 

58 

0 

61 

0 

53 

0 

51 

0 

51 

0 

50 

0 

.50 

0 

45 

X2(2) 

0 

95 

0 

93 

0 

88 

0 

84 

0 

84 

0 

84 

0 

.84 

0 

78 

MX1 

0 

43 

0 

17 

0 

11 

0 

10 

0 

10 

0 

10 

0 

.10 

0 

08 

MX2 

0 

01 

0 

01 

0 

01 

0 

01 

0 

01 

0 

01 

0 

.01 

0 

01 

MX3 

0 

77 

0 

72 

0 

53 

0 

45 

0 

45 

0 

45 

0 

.45 

0 

36 
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0 

10 

0 

10 

0 

08 

0 
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0 

.08 

0 
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TABLE  5.16a 

Power  of  the  Test  for  Normality  T -j  ( c )  at 
Significance  Level  a  =  0.1,  n  =  100  and  p  =  5 


Lognorma 1 
T(9) 

T  ( 7 ) 

T(5) 

T(3) 

X|d4) 

X2(10) 

X2(6) 

X|(4) 

X2(2) 

MX1 

MX2 

MX3 

MX4 


.00  1.00 

■ 

■ 

.00  1.00 

.94  0.94 

m 

.92  0.92 

.99  0.99 

s 

.98  0.98 

.00  1.00 

.00  1.00 

.00  1.00 

m 

.00  1.00 

.55  0.59 

afcfe 1 

5  0 

.55  0.55 

.72  0.75 

m 

m 

.69  0.69 

.92  0.92 

m 

0  0.8 

m 

.87  0.87 

.98  0.98 

IT 

A 

.96  0.96 

TABLE 5.16c

Power of the Test for Normality $T_1(c)$ at
Significance Level $\alpha$ = 0.025, n = 100 and p = 5

                                          c
Alternative     0.3     0.2     0.1     0.004   0.002   -0.002  -0.004  -0.1
Lognormal       1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
T(9)            0.85    0.86    0.86    0.81    0.81    0.81    0.81    0.78
T(7)            0.95    0.96    0.95    0.93    0.93    0.93    0.93    0.91
T(5)            1.00    1.00    1.00    0.99    0.99    0.99    0.99    0.98
T(3)            1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
χ²(14)          0.32    0.36    0.37    0.34    0.34    0.34    0.34    0.31
χ²(10)          0.49    0.55    0.54    0.49    0.49    0.49    0.49    0.44
χ²(6)           0.80    0.82    0.80    0.72    0.72    0.72    0.72    0.66
χ²(4)           0.95    0.95    0.93    0.90    0.90    0.90    0.90    0.84
χ²(2)           1.00    1.00    1.00    0.99    0.99    0.99    0.99    0.99
MX1             0.67    0.48    0.37    0.27    0.27    0.27    0.26    0.21
MX2             0.02    0.03    0.03    0.03    0.03    0.03    0.03    0.04
MX3             1.00    1.00    0.99    0.95    0.95    0.95    0.95    0.88
MX4             0.32    0.31    0.28    0.22    0.22    0.22    0.22    0.20
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5.4  Analysis  of  Linear  Models 

In this section, it will be shown that the multivariate normality test statistic $T_1(c)$ can be applied to models such as linear regression that have additional structure. This is an extraordinary new result, since most other tests for normality cannot be used when the data have structure other than a mean and covariance. The Shapiro-Wilk test has been applied to two-way layouts (see Gentleman and Wilk, 1975); however, percentage points for the test statistic W had to be tabulated for the particular two-way layout. It is not practical to tabulate percentage points for each model considered. By adjusting c, it will be seen that the test statistic $T_1(c)$ can be made insensitive to the underlying model. In this way, $T_1(c)$ will provide an approximate test of normality for the residuals from a linear model. This result is especially useful for multivariate models, where probability plots are not as effective.


In Section 2.3, a number of models for the errors were presented
with the errors having the form

$$\epsilon_i = y_i - h(x_i;\theta) \tag{4.1}$$

where

$\epsilon_i$ is a p x 1 vector of errors,
$y_i$ is a p x 1 vector of observations,
$x_i$ is a q x 1 vector of concomitant variables, and
$\theta$ is a q x 1 vector of parameters to be estimated from the data.

The errors are assumed to be independent and identically distributed
Gaussian random variables with zero mean and covariance matrix R. It
is this assumption that the statistic $\tilde T_1(c)$ will examine. If
$h(x_i;\theta) = m$, then (4.1) is the model used in Section 5.2.


For the following, the density of the errors is

$$\begin{aligned}
f(\epsilon_i) &= |2\pi R|^{-1/2} \exp(-\epsilon_i^T R^{-1} \epsilon_i / 2) \\
              &= |2\pi R|^{-1/2} \exp[-(y_i - h(x_i;\theta))^T R^{-1} (y_i - h(x_i;\theta))/2].
\end{aligned} \tag{4.2}$$

The generalized likelihood $L_c(\theta)$ without the constant term, denoted
L(c) as in Section 2.3, is


ELJUL 

2*R 


$\exp(-c\,(y_i - h(x_i;\theta))^T R^{-1} (y_i - h(x_i;\theta))/2)$  (4.3)


<C>L  W  «P(-  C«( 


where a = 0.5c/(1+c).


For a proposed model $h(x;\theta)$ in L(c), model-critical estimates of $\theta$
and R are obtained by maximizing (4.3) over $\theta$ and R. For many models,
setting the derivatives of L(c) with respect to $\theta$ and R equal to zero
yields a set of implicit equations which can be solved via a fixed-point
algorithm (see Sections 2.2 and 2.3). As with the unstructured
case, estimates R(0) and R(c) of R can be used to obtain the test
statistic

$$\tilde T_1(c) = \frac{n}{2}\left\{ \operatorname{tr}\left[ R(c)R(0)^{-1} + R(0)R(c)^{-1} \right] - 2p \right\}.$$
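
As a minimal sketch of how the statistic above could be evaluated once the two covariance estimates are available, the following Python function computes (n/2){tr[R(c)R(0)^{-1} + R(0)R(c)^{-1}] - 2p}. The function name `t1_statistic` and its argument names are illustrative assumptions, not notation from the text; the estimates R(0) and R(c) are taken as given and must come from the fitting procedure of Sections 2.2 and 2.3.

```python
import numpy as np


def t1_statistic(R_c, R_0, n):
    """Evaluate (n/2) * (tr[R_c R_0^{-1} + R_0 R_c^{-1}] - 2p).

    R_c, R_0 : p x p covariance estimates at critical parameter c and at c = 0
               (scalars are accepted for the univariate case, p = 1).
    n        : sample size.
    The name and signature are hypothetical, chosen for illustration.
    """
    R_c = np.atleast_2d(np.asarray(R_c, dtype=float))
    R_0 = np.atleast_2d(np.asarray(R_0, dtype=float))
    p = R_0.shape[0]
    # tr[R_c R_0^{-1}] equals tr[R_0^{-1} R_c], so solve a linear system
    # rather than forming an explicit inverse.
    first = np.trace(np.linalg.solve(R_0, R_c))
    second = np.trace(np.linalg.solve(R_c, R_0))
    return 0.5 * n * (first + second - 2 * p)
```

When the two estimates coincide the statistic is zero, and for positive definite estimates it cannot be negative, since each eigenvalue λ of R(c)R(0)^{-1} contributes λ + 1/λ ≥ 2 to the trace.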


The tilde is used to indicate the test statistic obtained from a
structured model. Since $\tilde T_1(c)$ is defined similarly to $T_1(c)$, the
statistic $\tilde T_1(c)$ or a function of $\tilde T_1(c)$ may have approximately the
same distribution as $T_1(c)$. The definition of $\tilde T_1(c)$ indicates
that the effects on R(0) and R(c) due to estimating additional
parameters should approximately cancel. If $\tilde T_1(c)$ and $T_1(c)$ have
the same distribution, $\tilde T_1(c)$ could be used to test the normality of
the residuals in a linear model using the percentage points of
$T_1(c)$. This would eliminate the need to tabulate percentage points
of $\tilde T_1(c)$ for each model.

5.5 Monte Carlo Analysis of $\tilde T_1(c)$

Monte Carlo simulation was used to analyze $\tilde T_1(c)$ as a test of
normality of the residuals from a linear model, the emphasis being on
univariate models. For linear regressions, autoregressions, and
two-way layouts, percentage points based on 10,000 trials were obtained
as a function of c and sample size. For linear regressions, one- to
seven-parameter models were examined for sample sizes of 32, 64, and
128. The autoregressive models had one to eight parameters and sample
sizes of 40, 60, and 120. For both types of models, the c-values used
were -0.05, -0.025, -0.01, 0.01, 0.025 and 0.05. For c = -0.05,
-0.025, 0.025, and 0.05, Figures 5.1 to 5.12 are plots of the 90 and
95 percentage points of $\tilde T_1(c)$ as a function of model order for
p-parameter linear regressions. The solid line in each plot is the
tabulated percentage point of $T_1(c)$. The plots show that the
percentage points increase (decrease) as the number of parameters
increases for c > 0 (c < 0). Also, the rate of increase (c > 0) or
decrease (c < 0) increases with the magnitude of c. Thus, the critical
parameter c controls the behavior of the percentage points of $\tilde T_1(c)$.
That is, for a given model, the percentage points of $\tilde T_1(c)$ can be
"tuned" to those of $T_1(c)$ by an appropriate choice of c. However,
the change in percentage points does not dramatically alter the size
of the test, since the percentage of observations of $\tilde T_1(c)$ exceeding
the αth percentage point of $T_1(c)$ is approximately 1 - α.
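
The kind of simulation described above can be sketched as follows for a univariate p-parameter linear regression. This is an illustrative sketch under stated assumptions, not the program used for the reported tables. It assumes that setting the derivatives of L(c) to zero gives the weighted form w_i = exp(-c e_i^2 / (2 s^2)), a weighted least squares estimate of the coefficients, and s^2 = (1 + c) Σ w_i e_i^2 / Σ w_i; the fixed-point equations of Sections 2.2 and 2.3 are not reproduced in this part and may differ in detail. The design matrix (an intercept plus standard normal regressors), the unit coefficients, and all function names are likewise hypothetical choices.

```python
import numpy as np


def model_critical_fit(X, y, c, max_iter=500, tol=1e-12):
    """Fixed-point estimates (theta, s2) for y = X @ theta + e with scalar errors.

    Assumed update (see the lead-in): weights w_i = exp(-c e_i^2 / (2 s2)),
    theta by weighted least squares, s2 = (1 + c) * sum(w e^2) / sum(w).
    With c = 0 this reduces to ordinary least squares and s2 = RSS / n.
    """
    theta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ theta
    s2 = resid @ resid / len(y)
    for _ in range(max_iter):
        w = np.exp(-0.5 * c * resid**2 / s2)
        Xw = X * w[:, None]
        theta = np.linalg.solve(Xw.T @ X, Xw.T @ y)
        resid = y - X @ theta
        s2_new = (1.0 + c) * np.sum(w * resid**2) / np.sum(w)
        converged = abs(s2_new - s2) <= tol * s2
        s2 = s2_new
        if converged:
            break
    return theta, s2


def t1_tilde(X, y, c):
    """Univariate (p = 1 error dimension) form of the statistic for a fitted model."""
    n = len(y)
    _, s2_0 = model_critical_fit(X, y, 0.0)
    _, s2_c = model_critical_fit(X, y, c)
    return 0.5 * n * (s2_c / s2_0 + s2_0 / s2_c - 2.0)


def simulate_percentage_points(n, n_params, c, probs=(0.90, 0.95),
                               trials=2000, seed=0):
    """Estimate percentage points of the statistic under Gaussian errors."""
    rng = np.random.default_rng(seed)
    stats = np.empty(trials)
    for k in range(trials):
        # Hypothetical design: intercept plus standard normal regressors.
        X = np.column_stack([np.ones(n), rng.standard_normal((n, n_params - 1))])
        y = X @ np.ones(n_params) + rng.standard_normal(n)
        stats[k] = t1_tilde(X, y, c)
    return np.quantile(stats, probs), stats
```

Given simulated values for a structured model, the empirical size against a tabulated percentage point `q` of the unstructured statistic would be `np.mean(stats > q)`; the tabulated points themselves are not reproduced here.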

Tables 5.17 to 5.22 present the size of the test results for
p-parameter linear regressions and autoregressions. Tables 5.23 and


The choice of an appropriate c value for testing is not clear. Even
with the guidelines given previously, the choice of critical parameter
is still more subjective than objective. Choosing an effective c value
depends on the user's experience with model-critical methods. Table
5.28 lists c values which we feel are conservative for testing
residuals. Although the choice of c can be troublesome, it is this
flexibility of c which can make the test statistic $\tilde T_1(c)$ insensitive
to the structural model; this makes $\tilde T_1(c)$ unique among tests for
Gaussianity since no other test can be "tuned" to the model.


FIGURE 5.1. The 90 Percent Point of $T_1(c)$ Versus Model Order
for p-Parameter Linear Regressions with Gaussian
Errors, Sample Size = 64, and c = 0.05


TABLE 5.17a

Size of the Test $T_1(c)$ Versus Model Order for Linear
Regressions with 1, 2, 4, and 7 Parameters,
Sample Size = 128, and c = 0.025

                 Size of the Test
Model
Order     0.1     0.05    0.025   0.01

  1      0.099   0.051   0.025
  2      0.099   0.051   0.026   0.010
  4      0.100   0.051   0.025   0.010
  7      0.104   0.052   0.027   0.011

TABLE 5.17b

Size of the Test $T_1(c)$ Versus Model Order for Linear
Regressions with 1, 2, 4, and 7 Parameters,
Sample Size = 128, and c = 0.01

                 Size of the Test
Model
Order     0.1     0.05    0.025   0.01

  1               0.051   0.026
  2      0.098            0.026   0.010
  4      0.098            0.025
  7      0.104


TABLE 5.17c

Size of the Test $T_1(c)$ Versus Model Order for Linear
Regressions with 1, 2, 4, and 7 Parameters,
Sample Size = 128, and c = -0.01

                 Size of the Test
Model
Order     0.1     0.05    0.025   0.01

  1      0.098   0.051   0.025   0.009
  2      0.097   0.050   0.025   0.009
  4      0.097   0.050   0.025   0.009
  7      0.101   0.049   0.025   0.009

TABLE 5.17d

Size of the Test $T_1(c)$ Versus Model Order for Linear
Regressions with 1, 2, 4, and 7 Parameters,
Sample Size = 128, and c = -0.025

                 Size of the Test
Model
Order     0.1     0.05    0.025

  1      0.099   0.051   0.025
  2      0.099   0.050   0.025
  4      0.098   0.048   0.025
  7      0.101   0.047   0.023
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TABLE  5.18a. 

Size of the Test $T_1(c)$ Versus Model Order for Linear
Regressions with 1, 2, 4, and 7 Parameters,
Sample Size = 64, and c = 0.025

                 Size of the Test
Model
Order     0.1     0.05    0.025   0.01

  1      0.097   0.047   0.023   0.010
  2      0.096   0.048   0.025   0.010
  4      0.098   0.048   0.026   0.011
  7      0.097   0.052   0.029   0.011

Model
Order     0.1     0.05    0.025   0.01

  1      0.097   0.047   0.023   0.010
  2      0.095   0.047   0.024   0.010
  4      0.094   0.045   0.024   0.010
  7      0.095   0.047   0.024   0.008

TABLE 5.18d

Size of the Test $T_1(c)$ Versus Model Order for Linear
Regressions with 1, 2, 4, and 7 Parameters,
Sample Size = 64, and c = -0.025


Size of the Test
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TABLE  5.19a 

Size of the Test $T_1(c)$ Versus Model Order for Linear
Regressions with 1, 2, 4, and 5 Parameters,
Sample Size = 32, and c = 0.025

                 Size of the Test
Model
Order     0.1     0.05    0.025   0.01

  1      0.098   0.048   0.023   0.09
  2      0.100   0.048   0.023   0.09
  4      0.100   0.048   0.023   0.09
  5      0.101   0.049   0.023   0.09

TABLE 5.19b

Size of the Test $T_1(c)$ Versus Model Order for Linear
Regressions with 1, 2, 4, and 5 Parameters,
Sample Size = 32, and c = 0.01

                 Size of the Test
Model
Order     0.1     0.05    0.025   0.01

  1      0.097   0.049   0.023   0.009
  2      0.100   0.047   0.022   0.009
  4      0.096   0.046   0.021   0.009
  5      0.096   0.046   0.020   0.008

TABLE 5.19c

Size of the Test $T_1(c)$ Versus Model Order for Linear
Regressions with 1, 2, 4, and 5 Parameters,
Sample Size = 32, and c = -0.01


Size of the Test


TABLE 5.19d

Size of the Test $T_1(c)$ Versus Model Order for Linear
Regressions with 1, 2, 4, and 5 Parameters,
Sample Size = 32, and c = -0.025

                 Size of the Test
Model
Order     0.1     0.05    0.025   0.01

  1      0.101   0.048   0.024   0.009
  2      0.103   0.046   0.022   0.009
  4      0.097   0.042   0.018   0.007
  5      0.098   0.040   0.016   0.005
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TABLE  5.20a 

Size of the Test $T_1(c)$ Versus Model Order for
Autoregressions with 1, 2, 4, and 8 Parameters,
Sample Size = 120, and c = 0.025

                 Size of the Test
Model
Order     0.1     0.05    0.025   0.01

  1                       0.023
  2                       0.024
  4                       0.027
  8      0.104            0.029


TABLE  5.20b 

Size of the Test $T_1(c)$ Versus Model Order for
Autoregressions with 1, 2, 4, and 8 Parameters,
Sample Size = 120, and c = -0.025


Size  of  the  Test 


TABLE 5.21a

Size of the Test $T_1(c)$ Versus Model Order for
Autoregressions with 1, 2, 4, and 8 Parameters,
Sample Size = 60, and c = 0.025

                 Size of the Test
Model
Order     0.1     0.05    0.025   0.01

  1      0.100   0.050   0.025   0.010
  2      0.104   0.054   0.025   0.008
  4      0.103   0.053   0.027   0.009
  8      0.107   0.061   0.032   0.011

TABLE 5.21b

Size of the Test $T_1(c)$ Versus Model Order for
Autoregressions with 1, 2, 4, and 8 Parameters,
Sample Size = 60, and c = -0.025

                 Size of the Test
Model
Order     0.1     0.05    0.025   0.01

  1      0.098   0.049   0.025   0.009
  2      0.103   0.051   0.024   0.008
  4      0.099   0.048   0.024   0.008
  8      0.102   0.052   0.052   0.008
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TABLE  5.22a 

Size of the Test $T_1(c)$ Versus Model Order for
Autoregressions with 1, 2, 4, and 8 Parameters,
Sample Size = 40, and c = 0.025


Size  of  the  Test 


Mod  e  1 
Order 


0.100 
199 
02 
04 


0.048 

0.051 

0.053 

0.058 


TABLE  5.22b 


0.025 


0.025 

0.027 

0.030 

0.032 


0.011 
0. 
0.013 
0.013 


Size of the Test $T_1(c)$ Versus Model Order for
Autoregressions with 1, 2, 4, and 8 Parameters,
Sample Size = 40, and c = 0.01


Size  of  the  Test 


lei 

1  e  r  0.1 

0.05 

0.025 

0.0 

0.100 

0.047 

0.024 

0.0 

1  0.100 

0.051 

0.027 

0.0' 

1  0.101 

0.052 

0.029 

0.0' 

1  0.101 

0.054 

0.029 

0.0 


TABLE  5.22c 

Size of the Test $T_1(c)$ Versus Model Order for
Autoregressions with 1, 2, 4, and 8 Parameters,
Sample Size = 40, and c = -0.01


Size  of  the  Test 


0.047 

0.051 

0.050 

0.049 


25 
27 
0.027 
0.025 


TABLE 5.22d

Size of the Test $T_1(c)$ Versus Model Order for
Autoregressions with 1, 2, 4, and 8 Parameters,
Sample Size = 40, and c = -0.025


Size  of  the  Test 


25 

0. 

26 

0. 

27 

0. 

26 

0. 

23 

0. 

   c        0.1     0.05    0.025   0.01

  0.05     0.104   0.054   0.027   0.011
  0.025    0.097   0.046   0.023   0.008
  0.01     0.095   0.043   0.020   0.007
 -0.01     0.090   0.039   0.017   0.006
 -0.025    0.091   0.037   0.015   0.005
 -0.05     0.090   0.037   0.014   0.003
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TABLE  5.24a 


Size of the Test $T_1(c)$ Versus Critical Parameter for a
2-Dimensional, 3x4 Two-Way Layout with
Interaction, Sample Size = 120

                 Size of the Test

   c        0.1     0.05    0.025   0.01

  0.025    0.107   0.052   0.025   0.011
  0.008    0.102   0.046   0.021   0.008
 -0.008    0.093   0.041   0.019   0.007
 -0.025    0.088   0.036   0.017   0.006

TABLE 5.24b

Size of the Test $T_1(c)$ Versus Critical Parameter for a
2-Dimensional, 3x4 Two-Way Layout with
Interaction, Sample Size = 60

                 Size of the Test

   c        0.1     0.05    0.025   0.01

  0.025    0.110   0.055   0.026   0.010
  0.008    0.097   0.045   0.021   0.007
 -0.008    0.087   0.037   0.016   0.006
 -0.025    0.077   0.031   0.013   0.004

TABLE 5.25

Power Results of the Statistic $T_1(c)$ for
Unstructured Data (m); Autoregressive (AR) Models with 1,
2, and 4 Parameters; a Two-Way (TW) Layout Model with and
Without Interaction; Sample Size is 120; and α = 0.05


0.05 

0.025 

0.025 

0.05 


.05 

0.' 

.025 

0.' 

.025 

0. 

.05 

o.: 

.37 

0. 

,38 

0. 

38 

.37 

0. 

,38 

0. 

,37 

.37 

0. 

,37 

0. 

.36 

.36 

0. 

,37 

0. 

35 

0.40  X2(10) 


0.05 

0.78 

0.75 

0.75 

0.025 

0.77 

0.75 

0.75 

0.025 

0.76 

0.75 

0.75 

0.05 

0.  lb 

0.  75 

0.  lb 



TABLE 5.26

Power Results of the Statistic $T_1(c)$ for
Unstructured Data (m); Autoregressive (AR) Models with 1,
2, and 4 Parameters; a Two-Way (TW) Layout Model with and
Without Interaction; Sample Size is 60; and α = 0.05

Alternative      c       m     AR(1)   AR(2)   AR(4)   TW(8)   TW(20)

t(9)           0.025   0.25    0.25    0.24    0.23    0.22    0.18
               0.01    0.25    0.25    0.24    0.22    0.21    0.17
              -0.01    0.24    0.25    0.23    0.22    0.19    0.15
              -0.025   0.24    0.24    0.23    0.21    0.19    0.15

t(5)           0.025   0.50    0.50    0.51    0.47    0.46    0.41
               0.01    0.50    0.50    0.50    0.47    0.45    0.39
              -0.01    0.49    0.49    0.49    0.46    0.44    0.36
              -0.025   0.48    0.49    0.49    0.45    0.42    0.35

χ²(10)         0.025   0.29    0.27    0.27    0.27    0.26    0.23
               0.01    0.28    0.27                    0.25    0.21
              -0.01    0.28    0.27    0.27            0.24    0.18
              -0.025   0.27                    0.26    0.24    0.18

χ²(4)          0.025   0.56    0.53    0.52    0.51    0.48    0.42
               0.01    0.55    0.53    0.52    0.51    0.46    0.40
              -0.01    0.54    0.53    0.51    0.50    0.45    0.37
              -0.025   0.52    0.53    0.51    0.50    0.44    0.35

TABLE 5.27

Power Results of the Statistic $T_1(c)$ for
Unstructured Data (m); Autoregressive (AR) Models with 1,
2, and 4 Parameters; Sample Size is 40; and α = 0.05


-0.025 


0.01 

-0.01 

-0.025 


0. 

0.17  t( 9 ) 

0. 

0. 


0.38 

0.38 

0.37 

0.34 

0.38 

0.37 

0.36 

0.33  t(  5) 

0.37 

0.37 

0.36 

0.33 

0.37 

0.37 

0.35 

0.32 

0.3 
0.3 
0.35  Xc(4) 

0.35 
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5 . 6  The  Effect  of  the  Model  Selected 


The analysis thus far has assumed that the true model was used to
fit the data. In practice, a selection criterion, such as the PSIC
criterion of (2.4.17), is used to select a model from a set of
candidate models. The analysis and tests of fit are applied to the
selected model. To examine the effect of the fitted model on the
statistic $\tilde T_1(c)$, data from four-parameter linear regression and
autoregression models were fit with models of 2, 3, 4, 5 and 6
parameters. For c = -0.025 and 0.025, Figures 5.13 to 5.16 are plots of
the .95 percentage point of $\tilde T_1(c)$ as a function of the fitted model
order.

If the data are fitted with the true model or a larger model
containing the true model, the percentage points are approximately the
same, as seen in the plots. For autoregressions, the plots indicate
that the percentage points increase when the data are overfitted
regardless of the sign of c. The plots also indicate that the test is
liberal when the data are overfitted; a smaller value of c will reduce
the effect on $\tilde T_1(c)$ of overfitting the data. For the four-parameter
linear regression data, the percentage points indicate that $\tilde T_1(c)$ is
a conservative test when the data are fitted at the true or larger
model. For both linear regressions and autoregressions, the figures
show that the test is very conservative, in general, when the data are
underfit. This is not always true, as seen in Figures 5.13 and 5.14
for the three-parameter linear regression model fit to data produced
by a four-parameter linear regression model. From the plots and the
above discussion, we see that overfitting the data is preferred since
the test statistic becomes stable when the fitted model contains the
true model.


The effect of the selected model on $\tilde T_1(c)$ was analyzed via Monte
Carlo simulation by examining autoregressive models with 1, 2 and 4
parameters. For each model, $\tilde T_1(c)$ was tabulated for the selected
model. The model was selected using the PSIC criterion of (2.4.17)
with s(c) = (1 + c). For a sample size of 40, Table 5.29 presents the
size of the test results for c = -0.025 and 0.025; these results are
in good agreement with the size of $T_1(c)$ for unstructured data. If
larger sample sizes are used, the agreement between the size of the
test $\tilde T_1(c)$ for structured and unstructured data improves; also,
larger c values can be used.


For the above models, a power study was performed on $\tilde T_1(c)$
obtained from the selected model. The alternative distributions were
the chi-square distribution with 4 and 10 degrees of freedom, and the
t-distribution with 5 and 9 degrees of freedom. The power results are
shown in Table 5.30. It can be seen that these results compare
favorably with the results in Tables 5.25 to 5.27 where the model is
known.




TABLE 5.29a

Size of the Test $T_1(c)$ Using the
Selected Model for Autoregressive Processes with
1, 2, and 4 Parameters; Sample Size is
40; and c = 0.025

                 Size of the Test

  p       0.1     0.05    0.025   0.01

  1      0.104   0.052   0.027   0.012
  2      0.106   0.052   0.028   0.013
  4      0.109   0.058   0.032   0.014


TABLE 5.29b

Size of the Test $T_1(c)$ Using the
Selected Model for Autoregressive Processes with
1, 2, and 4 Parameters; Sample Size is
40; and c = 0.01


Size of the Test

TABLE 5.29c

Size of the Test $T_1(c)$ Using the
Selected Model for Autoregressive Processes with
1, 2, and 4 Parameters; Sample Size is
40; and c = -0.01

                 Size of the Test

  p       0.1     0.05    0.025   0.01

  1      0.102   0.051   0.026
  2      0.105   0.050   0.026
  4      0.102   0.051   0.027

TABLE 5.29d

Size of the Test $T_1(c)$ Using the
Selected Model for Autoregressive Processes with
1, 2, and 4 Parameters; Sample Size is
40; and c = -0.025


Size of the Test


TABLE 5.30

Power Results of the Statistic $T_1(c)$ for
Unstructured Data (m); the Selected Model for
Autoregressive (AR) Processes with 1, 2, and 4
Parameters; Sample Size is 40; and α = 0.05


0.22 

0.22 

0.20 

0.21 

0.20 

0.21 

0.20 

025 

0 

01 

0 

01 

0 

0.3 

0.33  X2 ( 4 ) 
0.3 
0.3 




5.7 Discussion

The procedures presented here and in Part 2 provide a means to
simultaneously analyze the goodness of fit of both the parametric and
distributional form of a model. The model-critical selection
criterion, PSIC, and the goodness of fit statistic, $\tilde T_1(c)$, are
complementary procedures. The PSIC procedure selects the best
parametric model consistent with Gaussianity. After the model is
selected, the statistic $\tilde T_1(c)$ is used to test the normality of the
residuals. Since both the parametric and distributional form of the
model are used to obtain the test statistic $\tilde T_1(c)$, the test jointly
examines both parts of the model. For testing the residuals with
$\tilde T_1(c)$, small values of |c| should be used. If the test rejects the
normality of the residuals, larger values of |c| should be used to
obtain model-critical parameter estimates and model-critical weights.
These estimates and weights can be used to aid in determining why the
test rejected normality. In Part 6, the selection and testing
procedures will be applied to experimental data.

PART 6

APPLICATIONS  OF  MODEL-CRITICAL 
SELECTION  AND  TEST  OF  FIT 


6.1 Introduction

In this part, our selection criterion and test of fit will be
applied to some experimental data. The model-critical parameter
estimates and weights will also be used in analyzing the adequacy of
the assumed model.


The analysis of a set of data involves fitting the data with a set
of candidate models for several values of c. The model is selected
using the PSIC criterion. If the model selected depends on c, the
model with the larger number of parameters is selected since
underfitting the data is more serious than overfitting the data. For
the selected model, the test statistic $\tilde T_1(c)$ is calculated and
compared to the appropriate percentage point. If the normality of the
residuals is rejected, the analysis continues by examining the
model-critical parameter estimates and weights, probability plots of
the residuals, and plots of the residuals versus fitted value, for a
range of c values. Of these diagnostics, the critical weights are
valuable in identifying inconsistencies between the data and the
assumed model. In fact, regardless of the test of fit result, the
model-critical parameter estimates and weights should be examined since
they may expose a problem with the data and assumed model not detected
by the statistic $\tilde T_1(c)$.

6.2 An ARMA(p,q) Example

The time series of 197 chemical process readings from Series A in
Box and Jenkins (1970) is examined in this section. The observations
were sampled every two hours and are shown in Figure 6.1. Examination
of the autocorrelations and partial correlations suggests an ARMA(1,1)
model (see Box and Jenkins, 1970, for details). To verify this, the
data were fit with ARMA(p,q) models for p ≤ 2 and q ≤ 2. For c = 0.1,
Table 6.1 is a list of the models considered, the model-critical
estimate of the error variance, and the value of the model-critical
selection criterion with s(c) = (1 + c).


TABLE 6.1

The Models Considered, Innovations Variance Estimate,
and PSIC Criterion with s(c) = (1 + c) for the
Chemical Process Data; c = 0.1

MODEL         S²(0.1)    PSIC

ARMA(1,0)      0.102     -2.26
ARMA(0,1)      0.124     -2.07
ARMA(1,1)      0.094     -2.32
ARMA(2,0)      0.096     -2.30
ARMA(0,2)      0.108     -2.18
ARMA(2,1)      0.094     -2.31
ARMA(1,2)      0.094     -2.31
ARMA(2,2)      0.093     -2.30

From Table 6.1, it can be seen that the ARMA(1,1) model is selected by
the PSIC criterion. For this model, the value of the test statistic
$\tilde T_1(c)$ is 0.102, which exceeds the .95 percentage point of 0.089 (see
Lawrence, Paulson, and Swope, 1986, for percentage points of $T_1(c)$
with sample sizes larger than 120); therefore, we would reject the
normality of the residuals. It is noted that $\tilde T_1(0.05)$ is $0.768 \times 10^{-2}$,
which is less than the 0.90 percentage point of $1.60 \times 10^{-2}$. For a
fixed value of c, the analysis of Part 5 shows that the percentage
points of $\tilde T_1(c)$ for structured data tend toward the percentage
points of $T_1(c)$ for unstructured data as the sample size increases.
Also, for a fixed sample size, the goodness of fit test becomes
increasingly liberal as c increases. Since the test rejected
normality for c = 0.1 and not for c = 0.05, it is possible that the
rejection was due to the value of c being too large. However,
considering the large number of observations and the small number of
parameters, we feel that the test result for c = 0.1 is valid. Thus,
we must trade off some power of the test for a conservative test.
Figures 6.2a to c are plots of the maximum likelihood and
model-critical residuals for the ARMA(1,1) model; they indicate
possible outliers at observations 43 and 64.
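
The selection step can be illustrated with the values of Table 6.1. The sketch below assumes that the preferred model is the one with the smallest PSIC value, which is consistent with the ARMA(1,1) choice reported above but is an inference from the tabulated values rather than a restatement of (2.4.17). The function names and the tie-breaking rule are hypothetical; the rule for combining several values of c follows the guideline of Section 6.1 that the model with the larger number of parameters is taken when the selection depends on c.

```python
# PSIC values at c = 0.1 for ARMA(p, q), keyed by (p, q); copied from Table 6.1.
PSIC_TABLE_6_1 = {
    (1, 0): -2.26, (0, 1): -2.07, (1, 1): -2.32, (2, 0): -2.30,
    (0, 2): -2.18, (2, 1): -2.31, (1, 2): -2.31, (2, 2): -2.30,
}


def select_model(psic):
    """Return the (p, q) with the smallest PSIC value.

    Ties are broken toward the larger parameter count p + q (an assumption).
    """
    return min(psic, key=lambda order: (psic[order], -sum(order)))


def select_across_c(psic_by_c):
    """Select one model from per-c PSIC tables.

    psic_by_c maps each c value to a {(p, q): PSIC} dictionary. If the
    selected model differs across c, the one with more parameters is kept.
    """
    winners = [select_model(table) for table in psic_by_c.values()]
    return max(winners, key=sum)


print(select_model(PSIC_TABLE_6_1))
```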

Our analysis continues by examining the model-critical and maximum
likelihood parameter estimates for the ARMA(1,1) model; Table 6.2
presents the model-critical parameter estimates for c = 0, 0.1, 0.2,
0.3 and 0.4.
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TABLE  6.2 


Maximum Likelihood (c = 0) and Model-Critical (c ≠ 0)
Estimates of the ARMA(1,1) Model used to fit the
Chemical Process Data

   c      a_1(c)    b_1(c)    s²(c)

  0       0.908    -0.576    0.0977
  0.1     0.905    -0.547    0.0946
  0.2     0.892    -0.508    0.0916
  0.3     0.887    -0.486    0.0886
  0.4     0.887    -0.460    0.0870

Noting the analysis in Section 3.4 of the simulated ARMA(2,1) process
with t-distributed errors, the changes in $b_1(c)$ and $s^2(c)$ suggest
that the residuals have a heavy-tailed distribution. Figures 6.3a and
6.3b present Gaussian probability plots of the maximum likelihood and
model-critical residuals; the critical residuals were obtained using c
= 0.4. From the plots, the residual distribution appears to have a
heavy right tail and a short left tail. For c = 0.4, Figure 6.4 is a
plot of the model-critical weights. Figures 6.3 and 6.4 indicate that
observations 43 and 64 may be outliers. Inspecting data around
observation 43 indicates that this observation may have been recorded
incorrectly. Observation 43 has a value of 16.5, but the observations
on either side have values about 17.5.
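
The way the variance estimates of Table 6.2 are read can be written as a small check. The sketch below only encodes the heuristic stated in this part: a variance estimate that falls as c increases is taken here to suggest heavy tails, and one that rises is taken in Section 6.3 to suggest a short-tailed or broad-shouldered distribution. The function names and the "inconclusive" fallback are hypothetical, and the result is a prompt for further diagnostics, not a test.

```python
def variance_trend(c_values, s2_values):
    """Classify s2(c) as 'decreasing', 'increasing', or 'mixed' in c."""
    pairs = sorted(zip(c_values, s2_values))
    steps = [later[1] - earlier[1] for earlier, later in zip(pairs, pairs[1:])]
    if steps and all(step < 0 for step in steps):
        return "decreasing"
    if steps and all(step > 0 for step in steps):
        return "increasing"
    return "mixed"


def tail_hint(c_values, s2_values):
    """Map the trend to the interpretation used in Sections 6.2 and 6.3."""
    trend = variance_trend(c_values, s2_values)
    if trend == "decreasing":
        return "heavy-tailed"
    if trend == "increasing":
        return "short-tailed"
    return "inconclusive"


# Values of s2(c) copied from Table 6.2.
c_grid = [0.0, 0.1, 0.2, 0.3, 0.4]
s2_table_6_2 = [0.0977, 0.0946, 0.0916, 0.0886, 0.0870]
print(tail_hint(c_grid, s2_table_6_2))
```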
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6 . 3  A  Linear  Regression  Example 
In this section, we use the PSIC model selection criterion and the
goodness of fit test statistic $\tilde T_1(c)$ to analyze the abrasion
resistance of rubber data in Table 2.3. From (2.3.8), the full
quadratic model is

$$y = a_0 + a_1 x_1 + a_2 x_1^2 + a_3 x_2 + a_4 x_2^2 + a_5 x_1 x_2 + \epsilon. \tag{6.1}$$

For regression models with K possible independent variables, there are
$2^K$ possible models (Daniel and Wood, 1980, Chapter 6). Using the
model of (6.1), there are 32 possible regression models. For the model
in (6.1), all possible regression models were examined by the PSIC
selection criterion for c = -0.1, 0.1, 0.2, and 0.3. Table 6.3
presents the two best models selected by the PSIC criterion with
s(c) = (1 + c). From the table, it can be seen that the full model is
selected only for c = 0.3.
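
The count of 32 candidate models follows from taking every subset of the five non-constant terms of the full quadratic model. A short enumeration makes this concrete; the term labels below are illustrative names for the five terms, and the intercept is assumed to be present in every candidate.

```python
from itertools import combinations

# Illustrative labels for the five non-constant terms of the full quadratic model.
TERMS = ("x1", "x1^2", "x2", "x2^2", "x1*x2")


def all_submodels(terms):
    """Return every subset of the candidate terms, from empty to full."""
    return [subset
            for size in range(len(terms) + 1)
            for subset in combinations(terms, size)]


models = all_submodels(TERMS)
print(len(models))  # 2**5 candidate models
```

Each subset would then be fitted for every c of interest and ranked by its PSIC value.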

A plot of abrasion resistance versus silica level, shown in Figure
6.6, indicates primarily a linear relationship. When c = 0.3, the
downweighting of observation 1 results in a more quadratic character
to the data in Figure 6.5; the Y at silica level equal to one on the
plot signifies observation 1. This effect can also be seen in Table
2.4, where the coefficient of the squared silica term increases with c.

For the full quadratic model, the test statistic $\tilde T_1(c)$ was
calculated for c = -0.01 and c = 0.01, and yielded 0.963x10^-4 and
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The  Variables  in  the  Two  Best  Regression  Models  and 
Corresponding PSIC Values for c = -0.1, 0.1, 0.2,
and  0.3;  Rubber  Abrasion  Resistance  Data 


Variables 
in  the 


Model 

c 

PSIC 

2 

■  1 ,  x2  >  x2 ’  Xl  x2 

-0.1 

?  2 

]  *  x-]  •  X2 '  X2  ’  X1  X2 

2.991 

x2 >  x2 ’  x  i  x2 


0.1 


3.053 
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x,  ,  x 


x,  x. 


r  r  2’  2’  i  2 


0.1 


3.167 


0.859x10 , respectively. Using Table 5.1 and interpolating between
n = 20 and 24 to obtain percentage points for n = 22, we see that both
values of $\tilde T_1(c)$ fall well below the 0.75 percentage points of the
test for c = -0.01 and c = 0.01. However, the small weights at
observations 1 and 13 for c = 0.3, and the change in parameter
estimates with c indicate that the model is still suspect. In fact,
Table 2.4 shows that the estimate of the error variance increases as c
increases from -0.1 to 0.1. From our experience, this indicates a
short-tailed or broad-shouldered distribution. When fitting the data
with a structured model, this suggests that additional variables may
need to be considered in the model. Next, we examined Gaussian
probability plots of the residuals for c = 0, 0.1, 0.3; the plots are
shown in Figures 6.6a to 6.6c. The plots indicated that the residual
distribution has a short right tail. For c = 0, observations 1 and 10
stick out on the left end of the plot in Figure 6.6a; however, the
residual distribution does not appear to be distinctly non-Gaussian.


The  plots  for  c  =  0.1  and  0.3  further  amplify  the  outliers  at 
observations  1  and  10. 
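
The interpolation of tabulated percentage points to an intermediate sample size, used above for n = 22, can be sketched as follows. Linear interpolation in n is an assumption, since the text does not state the interpolation rule, and the function name is hypothetical. The numbers in the usage line are placeholders, not entries of Table 5.1.

```python
def interpolate_percentage_point(n, n_lo, point_lo, n_hi, point_hi):
    """Linearly interpolate a percentage point between two tabulated sample sizes.

    n_lo < n_hi are the tabulated sample sizes bracketing n, and point_lo,
    point_hi are the tabulated percentage points at those sizes.
    """
    if not n_lo < n_hi:
        raise ValueError("n_lo must be smaller than n_hi")
    if not n_lo <= n <= n_hi:
        raise ValueError("n must lie between n_lo and n_hi")
    fraction = (n - n_lo) / (n_hi - n_lo)
    return point_lo + fraction * (point_hi - point_lo)


# Placeholder values for illustration only.
example = interpolate_percentage_point(22, 20, 0.010, 24, 0.006)
```

The observed statistic would be compared with the interpolated point in the same way as with a tabulated one.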


The analysis points out that caution should be exercised when using
the model. If the model is to be used for prediction, additional
observations at silica and coupling agent levels equal to one should be
obtained. Also, the possibility of additional variables should be
explored. Suich and Derringer (1977) used this data to obtain a
response surface for y as a function of $x_1$ and $x_2$. Delehanty


6.4 Analysis of a Two-way Layout Example

In this section, we analyze the replicated survival time data for
three poisons and four treatments shown in Table 2.7. The PSIC
criterion is used to choose between the full model with Poison,
Treatment, and Poison x Treatment effects and the additive model with
only Poison and Treatment effects. For c = 0.1 and s(c) = (1 + c), the
PSIC values are -3.03 and -3.50 for the full and additive models,
respectively. The additive model, which has the smaller PSIC value, is
selected since the reduction in the error variance does not indicate a
need for the additional parameters. An interaction plot of fitted cell
means for c = 0, 0.1, 0.2, and 0.3 is shown in Figure 6.8. The plots
agree with the PSIC criterion that the additional parameters of the
interaction model are not warranted.

For c = 0.01 and c = 0.02, the values of the test of fit statistic
T^(c) are 0.0127 and 0.0289, respectively. Using the percentage
points in Table 5.1 and interpolating between n = 40 and n = 60, the
0.99 percentage points at n = 48 are 0.0023 and 0.0093 for c = 0.01 and
0.02, respectively. Both values of the test statistic exceed the
corresponding 0.99 percentage point of T^(c), and we reject the
normality of the residuals. Noting the difference between the
calculated test statistic and the corresponding 0.99 percentage point,
we can see the effect of the choice of c on the value of T^(c).
Gaussian probability plots of the model-critical residuals for c = 0
and 0.4 are shown in Figures 6.9a and 6.9b. The plots confirm the
non-Gaussian character of the data. Figure 6.10 is a plot of the
maximum likelihood (c = 0) residuals versus fitted values. Figures
6.9 and 6.10 indicate the need for a transformation.
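The decision rule used above can be sketched in a few lines. This is only an illustration: it assumes the interpolation in Table 5.1 is linear in n (the text does not say which scheme was used), the function names are ours, and the two tabulated values passed to the interpolation are hypothetical placeholders, not entries of Table 5.1. The observed statistics and the 0.99 percentage points at n = 48 are the ones quoted in the text.

```python
def interpolate_percentage_point(n, n_lo, p_lo, n_hi, p_hi):
    """Linearly interpolate a tabulated percentage point to sample size n.

    Assumption: linear interpolation in n between the two bracketing
    sample sizes n_lo and n_hi.
    """
    frac = (n - n_lo) / (n_hi - n_lo)
    return p_lo + frac * (p_hi - p_lo)


def rejects_normality(statistic, percentage_point):
    """Reject when the test statistic exceeds the chosen percentage point."""
    return statistic > percentage_point


# Hypothetical tabulated values at n = 40 and n = 60 (NOT from Table 5.1),
# used only to show the mechanics of the interpolation to n = 48.
example_point = interpolate_percentage_point(48, 40, 0.0030, 60, 0.0020)

# Values quoted in the text, keyed by c.
observed = {0.01: 0.0127, 0.02: 0.0289}
point_99 = {0.01: 0.0023, 0.02: 0.0093}

decisions = {c: rejects_normality(observed[c], point_99[c]) for c in observed}
for c, rejected in decisions.items():
    print(f"c = {c}: reject normality = {rejected}")
```

For n = 48 between 40 and 60 the interpolation weight is 0.4, so the interpolated point lies 40 percent of the way from the n = 40 entry to the n = 60 entry.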

The family of Box and Cox (1964) transformations,

    y^{(\lambda)} = \frac{y^{\lambda} - 1}{\lambda \, \dot{y}^{\lambda - 1}},    (6.2)

is considered, where \dot{y} is the geometric mean of the observations
and \lambda is the power to be estimated. Delehanty (1983) estimates
the power \lambda by maximizing the generalized likelihood L(c) over a
range of \lambda values. The value of \lambda was found to be about
-0.75. For a more meaningful transformation, the value \lambda = -1 was
chosen; this transforms survival times into death rates. An additive
model is fit to the reciprocal of the survival times. The values of our
test statistic for this model are 0.586x10^-3 and 0.214x10^-3 for
c = 0.01 and 0.02, respectively. Using Table 5.1, we see that both
values are considerably less than the 0.75 percentage points. The
model-critical weights for c = 0.4 are shown in Table 6.4 and indicate
that the additive model of the transformed data is reasonable. Figures
6.11a and 6.11b are probability plots of the residuals for c = 0 and
0.4, respectively; Figure 6.12 is a plot of the residuals (c = 0)
versus fitted value. These plots confirm the normality of the
residuals.
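As an illustration of the transformation in (6.2), the following is a minimal sketch of the geometric-mean-normalized power transformation. The function names and the sample values are ours and purely illustrative (they are not the survival times of Table 2.7), and the sketch does not include the maximization of L(c) over the power.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive observations."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalized_box_cox(values, lam):
    """Normalized Box-Cox transform: (y**lam - 1) / (lam * gm**(lam - 1)).

    gm is the geometric mean of the observations; the lam = 0 case uses
    the limiting form gm * log(y).
    """
    gm = geometric_mean(values)
    if lam == 0:
        return [gm * math.log(v) for v in values]
    scale = lam * gm ** (lam - 1)
    return [(v ** lam - 1) / scale for v in values]


# Illustrative positive values only (hypothetical, not thesis data).
sample = [1.0, 2.0, 4.0]

# lam = -1 gives a linear function of the reciprocal 1/y, so fitting a
# model to the transformed values is equivalent to modeling rates.
reciprocal_scale = normalized_box_cox(sample, -1.0)
print(reciprocal_scale)
```

With the power set to -1, the transformed value equals gm^2 (1 - 1/y), a decreasing linear function of the reciprocal; this is why the choice of -1 amounts to analyzing death rates rather than survival times.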


TABLE 6.4

Model-Critical Weights for Death Rates with
the Additive Model, c = 0.4

              Treatment
      1       2       3       4

    0.65    0.99    0.99    0.45
    0.88    0.96    1.0     0.99
    0.85    1.0     0.68    0.97
    0.94    0.92    0.45    0.93

Poison
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6.5 Summary

In  this  part,  we  have  illustrated  the  use  of  our  model  selection 
and  test  of  fit  procedures  to  analyze  experimental  data.  The  PSIC 
criterion  can  be  useful  for  selecting  regression  and  two-way  layout 
models  as  well  as  ARMA  models.  As  with  the  AIC,  the  PSIC  criterion 
can  be  applied  to  a  variety  of  parametric  models.  Our  test  of  fit  was 
shown  to  provide  a  measure  of  fit  between  the  data  and  the  selected 
model.  The  analysis  of  the  examples  yielded  some  surprising  results. 


For the linear regression example, the model-critical weights
indicated the presence of potential outliers; however, the PSIC
selection criterion did not select a smaller model with the outliers
downweighted. For data with outliers downweighted, the model selected
by the PSIC criterion is the one that best fits the remainder of the
data. This model may have more or fewer parameters than the model
selected without the outliers downweighted. Also, the test of fit
indicated that the quadratic model residuals are "supernormal"
(Gentleman and Wilk, 1975). This can result when there are additional
variables needed in the model or when a small sample size makes
rejecting normality difficult. The above comments provide areas for
further research.




The ARMA time series example illustrated that the analyst should
use the largest possible value of c for the sample size, number of
parameters, and dimension of the data. A large value of c is required
to adequately criticize the data and the model; however, the value of
c must be kept small in order that the goodness of fit test remains
conservative.
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